
Syntactic Trimming of Extracted Sentences for Improving Extractive Multi-document Summarization

Kamal Sarkar

Abstract— For managing a vast hoard of online or offline information, summarization can be a useful means because users can decide about the relevance of an individual document or a document cluster using just summary information. Multi-document summaries can enable users to identify the main theme (central idea) of a cluster of texts very rapidly. This paper presents a sentence compression based summarization technique that uses a number of local and global sentence-trimming rules to improve the performance of an extractive multi-document summarization system. For our experiments, we develop (1) a primary summarization system, which extracts sentences to form a draft summary, and (2) a trimming component, which accepts a draft summary for revision. The trimming component eliminates the low content and redundant elements from the sentences in the draft summaries using a number of local and global sentence-trimming rules without hampering the grammaticality and the fluency of the summaries. In effect, the trimming process makes room for more diverse and salient units to appear in a summary. Our test results on the DUC 2004 data set show that the summarization system which integrates both the extraction component and the trimming component performs better than some state-of-the-art summarization approaches.

Index Terms— Information Overload, Multi-document summarization, Natural Language Processing, Syntactic Trimming

——————————  ——————————

1 INTRODUCTION

The number of pages available on the web almost doubles every year. As a result, a search engine returns a large number of web pages in response to a single query, and it is sometimes very difficult for users to go through all the hits and find the relevant information in the collection. For managing a vast hoard of information, summarization can be a useful means because users can decide about the relevance of an individual document using just summary information, and multi-document summaries can also enable users to identify the main theme (central idea) of a cluster of texts very rapidly.

Multi-document summarization is a process which produces a condensed representation of the contents of multiple related text documents, collected from heterogeneous sources, for human consumption, and which facilitates very rapid assimilation of the main points of the related documents. If this kind of summarization facility is available on the web, users may initially summarize a document collection with the help of the facility and go through the gist to decide whether to click on the collection for in-depth study. The major steps of extractive multi-document summarization are: (1) extracting important textual units from multiple related documents, (2) removing redundancies, and (3) reordering or fusing the units to produce a fluent summary.

————————————————
The author is with the Computer Science & Engineering Department, Jadavpur University, Kolkata – 700 032, India.
————————————————

Some previous extractive summarization projects [1][2][3] have used a sentence ranking approach that utilizes a number of word-level and sentence-level features, such as sentence position, term frequency, or cue phrases, for ranking sentences. The top-K sentences are then selected based on a compression ratio.

In another approach to multi-document summarization [4], text summarization was viewed as the problem of summarizing the similarities and differences in information content among the documents in a collection.

Redundancy is one of the important issues in multi-document summarization. To remove redundancy, some systems select the top-most sentence first, measure the similarity of the next candidate textual unit (sentence or paragraph) to the previously selected ones, and retain it only if it contains enough new (dissimilar) information. A popular such measure is maximal marginal relevance [5].

Unlike the approach in [5], which uses a greedy approach to sentence selection and redundancy removal, the clustering-based approach controls redundancy in the final summary by clustering sentences to identify themes of common information and selecting one or two representative sentences from each cluster into the final summary [6][7].

Centroid-based multi-document summarization [8][9] ranks sentences based on their similarities to the centroid.
A centroid of a cluster of documents is defined as a pseudo-document consisting of words with TF*IDF scores greater than a threshold, where TF (term frequency) is computed as the average number of occurrences of a word across the set of documents to be summarized, and IDF (inverse document frequency), signifying the rarity of a word in a text corpus, is inversely proportional to the document frequency of the word. The document frequency of a word is defined as the number of documents that contain the word; it is a corpus statistic, computed on a large collection of documents.

A few approaches identify repetitive phrases from a cluster of documents and use information fusion techniques to form a fluent summary [10]. In the extractive multi-document summarization task, summary sentences come from multiple source documents, and picking sentences out of context may result in an incoherent summary. So, to increase the readability of the produced summary, most systems follow time order and text order (passages from the oldest text appear first, sorted in the order in which they appear in the input texts) [11]. The approach in [12] uses temporal and coherence constraints to order sentences.

Most summarization systems are either (1) sentence extraction based or (2) use sentence extraction as the primary component of the system. Sentence extraction is only the first half of the summarization problem. It was shown in [13] that revisions of multi-document summaries improve summary coherence. Other researchers have also focused on the possibility of improving summarization performance through summary revision [14]. Lin [15] presents a pilot study on improving sentence extraction based summarization performance by sentence compression. This pilot study reveals that local optimization at the sentence level, even using the good compression algorithm proposed by Knight and Marcu [16], is not enough to boost system performance, because the basic goal is to find the best compressed summaries, not the best compressed sentences. They conclude that global cross-sentence optimization may boost system performance, but they did not present any global optimization method in their paper. A basic element (BE)-based sentence compression algorithm has been discussed in [17].

Compared to the extractive summarization approaches discussed above, our summarization system has one additional component, which we call the trimming component. In effect, our system consists of two major components: (1) a sentence extraction component that extracts sentences to form a draft summary, and (2) a trimming component that applies local and global syntactic trimming rules to a draft summary for improvement. The local trimming rules eliminate unimportant elements from the individual sentences and the global trimming rules eliminate phrase-level redundancy across the summary sentences. Thus, the trimming makes room for more diverse and salient information to appear in a summary.

In our approach, the draft summaries are tagged by a part-of-speech (POS) tagger [18] before their submission to the trimming component. Some early works focus on local sentence compression methods [19] that use parsed sentences as input to a trimmer. Since complete parsing of a group of sentences takes more time than tagging them, and since the trimming quality is affected by a parser's parsing quality (which may differ from one parser to another), we use shallow parsing on the POS tagged output before applying the trimming rules.

The BE (basic elements)-based sentence compression algorithm [17] also requires parsing the sentences and breaking them into a number of basic elements using the BE package [20]. Compared to the BE-based sentence compression algorithm, which concentrates only on removing the redundant BEs (basic elements) from a summary, our summary compression algorithm eliminates unimportant portions of the individual sentences as well as phrase-level redundancy across the summary sentences.

Recently, a sentence trimming based summary revision approach was presented in [21], where the authors showed an improvement in the performance of multi-document summarization through sentence trimming, but comparisons with state-of-the-art summarization approaches were not presented. Moreover, they evaluated the summaries with the help of an old version of the ROUGE package (ROUGEeval-1.4.2).

The work described in this paper is an improvement over our previous work presented in [21]. Compared to our previous work, the present work uses a simple approach for sentence extraction and applies a larger number of trimming rules to show that the proposed trimming component can be connected to the output of a simple extractive summarizer for improved summarization performance, comparable to the performance of various state-of-the-art summarization systems. We evaluate summaries using the ROUGE package version 1.5.5.

In section 2 we describe the sentence extraction method that we use to implement our primary summarization system for draft summary generation. The proposed syntactic trimming component is covered in section 3. The experiments and results are discussed in section 4.

2 SENTENCE EXTRACTION METHOD

We re-implement a centroid based sentence extraction method [8] for draft summary generation. Our implementation of the centroid based sentence extraction method consists of a number of steps: (1) document preprocessing, (2) sentence ranking, and (3) draft summary generation.

2.1 Preprocessing

The preprocessing task primarily includes handling abbreviations and numeric words containing dots (.). The dots in abbreviations and numeric words (e.g., 12.5 millions) may mistakenly be recognized as a sentence boundary. We have used a number of syntactic rules to differentiate between dots in abbreviations, dots in numeric words, and dots at the end of a sentence. We also use a list of predefined abbreviations to prevent sentences from being mistakenly broken at the dots used in the abbreviations.

Before applying the sentence-ranking algorithm, the input documents are broken into a collection of sentences.
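The exact boundary rules are not listed in the paper, so the following is a minimal Python sketch of abbreviation-aware sentence splitting under this description; the abbreviation list and the boundary regular expression are illustrative assumptions, not the actual rule set.

```python
import re

# Illustrative abbreviation list; the paper's actual predefined list is not given.
ABBREVIATIONS = {"mr.", "dr.", "prof.", "u.s.", "e.g.", "i.e.", "inc.", "st."}

def split_sentences(text):
    """Split text at sentence-final dots. Dots inside numeric words
    (e.g. 12.5) never qualify because a boundary dot must be followed
    by whitespace and a capital letter; dots in known abbreviations
    are skipped via the ABBREVIATIONS list."""
    sentences, start = [], 0
    for m in re.finditer(r'[.!?](?=\s+[A-Z"])', text):
        end = m.end()
        last_token = text[start:end].split()[-1].lower()
        if last_token in ABBREVIATIONS:
            continue                      # e.g. "Dr." is not a boundary
        sentences.append(text[start:end].strip())
        start = end
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences

print(split_sentences("Dr. Smith paid 12.5 millions. He left. Mr. Roy stayed."))
# -> ['Dr. Smith paid 12.5 millions.', 'He left.', 'Mr. Roy stayed.']
```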
2.2 Sentence Ranking

After the input documents are formatted and segmented, the sentences are ranked based on two important features: similarity to the centroid and positional information. These two features have successfully been incorporated into the extractive multi-document summarizer called MEAD [9].

Centroid

A centroid is a pseudo-document consisting of words with TF*IDF scores greater than a predefined threshold, where TF (term frequency) is the average number of occurrences of a word across the input collection of documents to be summarized, and IDF (inverse document frequency) is computed on a corpus using the formula IDF = log(N/df), where N is the number of documents in the corpus and df (document frequency) indicates the number of documents in which the word occurs. The centroid score of a sentence k is computed as the sum of the centroid values of all the words in the sentence:

    C_k = Σ_w C_{w,k}

where C_{w,k} is the centroid value of a word w in a sentence k.

Positional Value

The positional value is computed as follows: the first sentence gets the highest score, and as the sentence number increases, the positional value decreases. The positional value for the sentence k is computed using the following formula:

    P_k = 1 / k

Combining Parameters for Sentence Ranking

We compute the sentence score as a linear combination of the two parameters, the centroid value and the positional value, using the following formula:

    Score(S_k) = α · (C_k / C_max) + β · P_k

where C_max indicates the centroid score of the sentence which gets the maximum centroid score in an input collection of sentences. For our experiments, we set α = β = 1.
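For illustration, a minimal Python sketch of this ranking scheme follows; the whitespace tokenization and the centroid threshold value used here are assumptions made for the example, not details taken from the paper.

```python
from collections import Counter

def centroid_values(documents, idf, threshold=1.0):
    """Build the centroid: average TF of each word across the input
    documents times its corpus IDF, keeping words above the threshold."""
    counts = Counter(w for doc in documents for w in doc.lower().split())
    tf = {w: c / len(documents) for w, c in counts.items()}   # average TF
    return {w: tf[w] * idf.get(w, 0.0)
            for w in tf if tf[w] * idf.get(w, 0.0) > threshold}

def score_sentences(sentences, centroid, alpha=1.0, beta=1.0):
    """Score(S_k) = alpha * (C_k / C_max) + beta * (1 / k)."""
    c = [sum(centroid.get(w, 0.0) for w in s.lower().split())
         for s in sentences]
    c_max = max(c, default=0.0) or 1.0
    return [alpha * (c_k / c_max) + beta * (1.0 / k)
            for k, c_k in enumerate(c, start=1)]
```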
2.3 Draft Summary Generation

In our summarization approach, the draft summaries produced by the extraction component become input to the trimming component. The draft summaries are produced by ranking the sentences based on their scores and selecting the K top-ranked sentences. In the process of draft summary generation, the system first selects the top ranked sentence and continues choosing sentences until the desired summary length is reached. While selecting a sentence, it is compared to the already selected sentences to verify whether it contains sufficiently dissimilar information. To do so, the similarity between the sentence under consideration and the already selected sentences is measured. If the similarity value is greater than a predefined threshold, the sentence under consideration is not included in the summary.

The cosine similarity metric is used to measure the similarity between two sentences. If the cosine similarity between two sentences is greater (less) than a threshold, we say that the sentences are similar (dissimilar). The cosine similarity between two sentences x and y is measured by the following formula, as stated in [22]:

    idf-modified-cosine(x, y) = [ Σ_{w ∈ x,y} tf_{w,x} · tf_{w,y} · (idf_w)² ] / [ √(Σ_{x_i ∈ x} (tf_{x_i,x} · idf_{x_i})²) · √(Σ_{y_i ∈ y} (tf_{y_i,y} · idf_{y_i})²) ]

where tf_{w,S} is the number of occurrences of the word w in the sentence S.

We generate a draft summary of 200 words with the target of generating a final summary of 665 bytes (approximately 100 words).
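A direct Python transcription of this measure might look as follows; this is a sketch, and the whitespace tokenization and the idf dictionary passed in are assumptions.

```python
import math
from collections import Counter

def idf_modified_cosine(x, y, idf):
    """idf-modified cosine between sentences x and y (Erkan & Radev [22])."""
    tx, ty = Counter(x.lower().split()), Counter(y.lower().split())
    num = sum(tx[w] * ty[w] * idf.get(w, 0.0) ** 2
              for w in tx.keys() & ty.keys())
    den_x = math.sqrt(sum((tx[w] * idf.get(w, 0.0)) ** 2 for w in tx))
    den_y = math.sqrt(sum((ty[w] * idf.get(w, 0.0)) ** 2 for w in ty))
    return num / (den_x * den_y) if den_x and den_y else 0.0
```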
3 PROPOSED SYNTACTIC TRIMMING METHOD

The size of a draft summary is fixed at 200 words, and all the draft summaries generated by the sentence extraction component are tagged by a POS tagger. The trimming rules are then applied one by one to the tagged draft summaries to revise them. From a trimmed draft summary, a final summary of 665 bytes is generated. For trimming, the tagged draft summaries are scanned from top to bottom and the trimming rules are tried on one sentence at a time.

We categorize the trimming rules as local and global trimming rules. When we remove low content words from a sentence, we call it local trimming; when we remove a redundant constituent from a sentence in consultation with other sentences in a draft summary, we call it global trimming.

3.1 Local Trimming

The local trimming rules and their application to English sentences are shown below (rules are numbered R1, R2, etc.).

R1: Delete time words such as "Monday" and time expressions such as "on Monday".

The following patterns of low-significance time words have been identified from the news corpus:

<Prep + Day> (Ex. On Sunday)
<Day> (Ex. Sunday)
<Day + night> (Ex. Sunday night)
<Prep + late + Day> (Ex. By late Sunday)
<since + last + Day> (Ex. since last Tuesday)

Besides the above-mentioned patterns, the following time-word sequences have also been identified from the corpus:

"last year", "last week", "last month", "the first time", "next month", "this year"
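A sketch of how the R1 patterns might be compiled into deletion regexes over raw text follows; the preposition list and the whitespace cleanup step are illustrative assumptions.

```python
import re

DAYS = r'(?:Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)'
PREPS = r'(?:on|by|since|until|before|after)'
TIME_SEQS = r'(?:last year|last week|last month|the first time|next month|this year)'

# The R1 patterns, longest alternatives first so e.g. "since last Tuesday"
# is removed as a whole rather than leaving "since last" behind.
R1_PATTERN = re.compile(
    rf'\b(?:{PREPS}\s+late\s+{DAYS}'    # <Prep + late + Day>
    rf'|since\s+last\s+{DAYS}'          # <since + last + Day>
    rf'|{PREPS}\s+{DAYS}(?:\s+night)?'  # <Prep + Day (+ night)>
    rf'|{DAYS}(?:\s+night)?'            # <Day (+ night)>
    rf'|{TIME_SEQS})\b', re.IGNORECASE)

def apply_r1(sentence):
    return re.sub(r'\s{2,}', ' ', R1_PATTERN.sub('', sentence)).strip()

print(apply_r1("The plant reopened on Monday after the storm last week."))
# -> "The plant reopened after the storm ."  (punctuation cleanup omitted)
```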
R2: Remove low content adverbs from the sentences.

The tagger assigns the tag <RB> to adverbs. We identified a few exceptions for which this rule should not be applied: some words such as "ago", "well", "earlier", "before" and "not" are tagged as <RB>, but they should not be deleted, so as to maintain the fluency of a summary.

The constituents to be deleted in the following example are shown in bold italics.

Tagged input:
[Portuguese/JJ writer/NN who/WP took/VBD up/IN literature/NN relatively/RB late/JJ in/IN life/NN and/CC whose/WP$ richly/RB imaginative/JJ novels/NNS soon/RB won/VBD him/PRP a/DT following/VBG of/IN loyal/JJ . . .]

After application of rule R2:
[Portuguese/JJ writer/NN who/WP took/VBD up/IN literature/NN late/JJ in/IN life/NN and/CC whose/WP$ imaginative/JJ novels/NNS won/VBD him/PRP a/DT following/VBG of/IN loyal/JJ . . .]
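Over the (word, tag) pairs produced by the POS tagger, R2 reduces to a short filter. A sketch, where the exception list follows the paper but the data representation is an assumption:

```python
# Adverbs that must survive R2 to keep the summary fluent (from the paper).
RB_EXCEPTIONS = {"ago", "well", "earlier", "before", "not"}

def apply_r2(tagged):
    """Drop <RB>-tagged tokens except the listed fluency-critical adverbs.
    `tagged` is a list of (word, tag) pairs, e.g. [("relatively", "RB"), ...]."""
    return [(w, t) for w, t in tagged
            if t != "RB" or w.lower() in RB_EXCEPTIONS]
```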
R3: Remove adjectives with idf < 2.5 from a noun phrase if the noun phrase contains more than 2 keywords. Here idf means inverse document frequency (as used in traditional information retrieval settings). The DFA used for identifying noun phrases is shown in Fig. 1; a code sketch of this chunking appears after the R5 example below.

[Fig. 1: DFA for noun phrase identification, with transitions over articles, adjectives and nouns]

Tagged input:
[But/CC with/IN exorbitant/JJ salaries/NNS paid/VBN to/TO several/JJ unproven/JJ stars/NNS over/IN the/DT last/JJ few/JJ years/NNS and/CC 100/CD million/CD deals/NNS sprouting/VB…]

After application of the rule R3:
[But/CC with/IN exorbitant/JJ salaries/NNS paid/VBN to/TO unproven/JJ stars/NNS over/IN the/DT few/JJ years/NNS and/CC 100/CD million/CD deals/NNS sprouting/VB…]

R4: Delete the information source (such as "a spokesman said today") appearing at the end of a sentence with the pattern <comma + segment + dot>, where the segment does not contain any named entity but contains words such as "reported", "said", etc.

The segment indicating news source information usually starts with a comma (,) and ends with a period (.). This segment is more distinctly identified by a list of domain specific keywords and phrases such as "reported", "announced", "according to" and "officials said".

R5: If a sentence starts with a PP with the pattern <IN + NP + Comma>, delete it if the noun heads in the PP are lightweight. Here the tag <IN> indicates a preposition, PP means prepositional phrase and NP means noun phrase.

Here, lightweight means that the weight is less than a threshold (2.2 in this setting).

Tagged input:
[In/IN an/DT almost/RB casual/JJ fashion/NN ,/, the/DT document/NN seems/VBZ to/TO confirm/VB two/CD of/IN the/DT central/JJ charges/NNS of/IN the/DT federal/JJ case/NN against/IN bin/NN Laden/NNP.]

After application of rule R5:
[the/DT document/NN seems/VBZ to/TO confirm/VB two/CD of/IN the/DT central/JJ charges/NNS of/IN the/DT federal/JJ case/NN against/IN bin/NN Laden/NNP.]
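Fig. 1 is described rather than reproduced here, so the chunker below is a sketch of one plausible reading of it (an optional article, then adjectives, closing on nouns), with the R3 adjective filter on top; the idf dictionary is an assumed input.

```python
def chunk_noun_phrases(tagged):
    """Greedy DFA-style NP chunker over (word, tag) pairs:
    optional article (DT), then adjectives (JJ*), ending in nouns (NN*).
    A sketch of one plausible reading of Fig. 1, not the exact automaton."""
    phrases, current = [], []
    for w, t in tagged:
        if t.startswith("JJ") or t.startswith("NN") or (t == "DT" and not current):
            current.append((w, t))              # article/adjective/noun grows the NP
        else:
            if any(tag.startswith("NN") for _, tag in current):
                phrases.append(current)         # only keep chunks that reached a noun
            current = [(w, t)] if t == "DT" else []
    if any(tag.startswith("NN") for _, tag in current):
        phrases.append(current)
    return phrases

def apply_r3(phrase, idf, idf_threshold=2.5, min_keywords=2):
    """R3: drop low-idf adjectives from NPs with more than 2 keywords."""
    keywords = [w for w, t in phrase if t.startswith(("JJ", "NN"))]
    if len(keywords) <= min_keywords:
        return phrase
    return [(w, t) for w, t in phrase
            if not (t.startswith("JJ") and idf.get(w.lower(), 0.0) < idf_threshold)]
```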
3.2 Global Trimming

Modifiers are used before or after named entities such as person names, organization names and location names. Noun phrases also contain modifier terms such as adjectives and adverbs. The modifier terms may repeat (partly or in full) in a summary along with the units (such as a named entity (NE) or a noun phrase (NP)) they are modifying. The redundant information found in such cases can be deleted with minimum loss of information and without loss of grammaticality. We consider the modifier terms as candidates for global trimming; parts of a modifier can be eliminated if they are already found in the previously selected sentences. Before applying the trimming rules, the modifier terms should be identified carefully to maintain grammaticality.
We apply different syntactic rules to identify modifier terms from the noun phrases and from the surroundings of the named entities. So, we have two types of global trimming: named entity centric sentence trimming and sentence trimming by thinning noun phrases.

Named Entity Centric Trimming

Named entity centric trimming has three steps: named entity identification, identification of the modifiers surrounding named entities, and formation of trimming rules.

A word with the tag <NNP> (which is used by the tagger to indicate a proper noun) is considered part of a named entity, and a sequence of words tagged with <NNP> constitutes a named entity. For example, the phrase "former Chilean dictator Augusto Pinochet" is tagged by the tagger as "former/JJ Chilean/JJ dictator/NN Augusto/NNP Pinochet/NNP", and "Augusto/NNP Pinochet/NNP" constitutes a named entity since it is a sequence of NNPs.

In our work, we consider a noun phrase having a named entity (NE) at its head as a named entity phrase (NEP). An example of a NEP is "former Chilean dictator Augusto Pinochet", where "Augusto Pinochet" is the named entity at the head of the NEP.

We divide a NEP into two parts as <modifier + NE>. But sometimes it may happen that a very common word (such as "President") appears as a part of a NEP because the tagger assigns it the tag <NNP>. To handle this situation, we convert the word to lowercase and check its idf value. If the word is found in the vocabulary and its idf value is < 2.5, we consider this word part of the modifier; otherwise we consider it part of the named entity. The procedure to identify the modifier and the named entity works as follows: say A is the leftmost word and H is the head word in a NEP. Scan the NEP from right to left, checking NNPs, and stop exactly when we encounter a non-NNP or an NNP with idf value < 2.5. Say we stop at such a word X. Then we consider the segment spanning from the word A to the word X as the modifier, and the rest of the NEP is considered the NE.
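The splitting procedure translates directly into a right-to-left scan. A sketch, where the idf lookup is an assumed dictionary:

```python
def split_nep(nep, idf, idf_threshold=2.5):
    """Split a named entity phrase into (modifier, named_entity).
    Scan right to left over (word, tag) pairs; the NE ends where we hit
    a non-NNP, or an NNP so common (idf < 2.5) that it is likely a title
    word such as "President" mis-tagged as a proper noun."""
    split = 0                                   # default: everything is NE
    for i in range(len(nep) - 1, -1, -1):
        w, t = nep[i]
        common = w.lower() in idf and idf[w.lower()] < idf_threshold
        if t != "NNP" or common:
            split = i + 1                       # words A..X form the modifier
            break
    return nep[:split], nep[split:]

nep = [("former", "JJ"), ("Chilean", "JJ"), ("dictator", "NN"),
       ("Augusto", "NNP"), ("Pinochet", "NNP")]
print(split_nep(nep, idf={}))
# -> modifier: former Chilean dictator, NE: Augusto Pinochet
```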
A modifier of a named entity may also appear in another form, called a "noun in apposition"; this form of modifier is extracted using the syntactic patterns <NE, M,> or <NE, M.>, where M is the modifier and NE is the named entity. The rule for named entity centric phrase trimming is as follows:

R6A: Delete the modifier of a named entity in its current mention in a sentence if the modifier of its current mention is similar to one of the modifiers of its earlier mentions in already-scanned sentences.

To apply the above rule, we maintain a list of modifiers for each mention of a named entity covered in the previously selected sentences while scanning a draft summary from top to bottom. Two modifiers are taken to be similar when the term-based similarity between them is greater than a threshold value (0.5 in our setting). The similarity between two modifiers M1 and M2, treated as sets of terms, is calculated as follows:

    Sim(M1, M2) = (2 × |M1 ∩ M2|) / (|M1| + |M2|)

Articles and prepositions (if any) are removed while measuring the similarity between two modifiers.
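This measure is the Dice coefficient over modifier terms. A minimal sketch, where the article/preposition stop list is an illustrative assumption:

```python
# Articles and prepositions dropped before comparison (illustrative list).
STOP = {"a", "an", "the", "of", "in", "on", "at", "by", "for", "to"}

def modifier_similarity(m1, m2):
    """Dice coefficient Sim(M1, M2) = 2|M1 & M2| / (|M1| + |M2|)."""
    s1 = {w.lower() for w in m1 if w.lower() not in STOP}
    s2 = {w.lower() for w in m2 if w.lower() not in STOP}
    if not s1 and not s2:
        return 0.0
    return 2 * len(s1 & s2) / (len(s1) + len(s2))

m1 = "Saudi exile Osama bin".split()
m2 = "Afghanistan based Saudi billionaire Osama bin".split()
print(modifier_similarity(m1, m2))   # 0.6 > 0.5, so R6A fires
```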
Tagged input:
(The candidate segment for trimming is shown in bold italics; the similar phrase found in the previously scanned sentences is shown in bold only.)

[Saudi/NNP exile/NN Osama/NNP bin/NN Laden/NNP ,/, the/DT alleged/VBN mastermind/NN …]
[…newspaper/NN interview/NN of/IN Afghanistan/NNP -/: based/VBN Saudi/NNP billionaire/NN Osama/NNP bin/NN Laden/NNP who/WP has/VBZ been/VBN accused/VBN…]

After application of rule R6A:

[...Saudi/NNP exile/NN Osama/NNP bin/NN Laden/NNP ,/, the/DT alleged/VBN mastermind/NN ...]
[.... interview/NN of/IN Laden/NNP who/WP has/VBZ been/VBN accused/VBN. . .]

In the above example, "Saudi/NNP exile/NN Osama/NNP bin/NN Laden/NNP" and "Afghanistan/NNP -/: based/VBN Saudi/NNP billionaire/NN Osama/NNP bin/NN Laden/NNP" are the two sentence segments, which are basically named entity phrases (NEPs). These two NEPs contain the same named entity (assuming that consecutive NNPs represent a named entity), "Laden", at their heads. So, our system divides the first segment into a modifier "Saudi exile Osama bin" and a head named entity "Laden", and the second segment into the modifier "Afghanistan based Saudi billionaire Osama bin" and a head named entity "Laden". According to the similarity metric mentioned above, the similarity between the two modifiers is > 0.5, so the modifier of the entity "Laden" in the second sentence is deleted according to rule R6A.

R6B: If any NP matches completely (word by word) with the modifier of a NEP already seen in the previously scanned sentences, replace the NP with the NE head of the NEP.

Tagged input:

[. . . the/DT trial/NN of/IN Malaysian/NNP former/JJ deputy/NN prime/JJ minister/NN Anwar/NNP Ibrahim/NNP on/IN charges/NNS of/IN corruption/NN. . .]
[. . . his/PRP$ concerns/NNS about/IN the/DT arrest/NN of/IN Malaysian/NNP former/JJ deputy/NN prime/JJ minister/NN ./.]

After application of rule R6B:

[. . . the/DT trial/NN of/IN Malaysian/NNP former/JJ deputy/NN prime/JJ minister/NN Anwar/NNP Ibrahim/NNP on/IN charges/NNS of/IN corruption/NN. . .]
[. . . his/PRP$ concerns/NNS about/IN the/DT arrest/NN of/IN Anwar/NNP Ibrahim/NNP ./.]
Simple NP Trimming

We trim the noun phrases containing a named entity at their heads using the named entity centric trimming rules discussed above, but we treat the noun phrases having no named entity at their heads differently. In this case, we consider the trailing non-noun words (adjectives and adverbs) of a noun phrase as modifier terms if the length of the NP is > 2 and the distance between the word and the noun head is >= 2. The distance between two phrasal words is measured as the position of the head word in the phrase minus the position of the word in the phrase; position values increase from left to right in the phrase. A noun word satisfying the above syntactic constraints is also considered a modifier word if the idf value of the word is < 2.5; a low idf value signifies that the word is very common in the text corpus. The NP trimming rules are as follows (a code sketch of R7A appears after the R7B example below).

R7A: If A is a NP in a sentence S and Bi is one of the NPs belonging to the list of noun phrases found in the already-scanned sentences, and head(A) = head(Bi), delete the modifier words of A which match those of Bi.

Tagged input:
(The candidate segment for trimming is shown in bold italics; the similar phrase in the previously revised sentence is shown in bold only.)

[. . . to/TO launch/VB the/DT first/JJ component/NN of/IN a/DT multibillion/JJ dollar/NN international/JJ space/NN station/NN after/IN a/DT year/NN of/IN delay/NN]
[The/DT first/JJ part/NN of/IN the/DT international/JJ space/NN station/NN was/VBD smoothly/RB orbiting/VBG . . .]

After application of rule R7A:

[. . . to/TO launch/VB the/DT first/JJ component/NN of/IN a/DT multibillion/JJ dollar/NN international/JJ space/NN station/NN after/IN a/DT year/NN of/IN delay/NN]
[The/DT first/JJ part/NN of/IN the/DT space/NN station/NN was/VBD smoothly/RB orbiting/VBG . . .]

R7B: Trim noun phrases of the pattern <CD + X>, where CD is the tag for a numeric value and X indicates common nouns specifying human beings. We assume a predetermined list of words such as "people", "persons", "soldiers", etc. for X.

Tagged input:
At/IN least/JJS 32/CD people/NNS were/VBD killed/VBN and/CC widespread/JJ flooding/NN prompted/VBD more/JJR than/IN 150,000/CD to/TO seek/VB higher/JJR ground/NN ./.

After application of rule R7B:
At/IN least/JJS 32/CD were/VBD killed/VBN and/CC widespread/JJ flooding/NN prompted/VBD more/JJR than/IN 150,000/CD to/TO seek/VB higher/JJR ground/NN ./.
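R7A needs only a running registry of noun-phrase heads and their modifier words. A sketch over chunked NPs, with data shapes assumed to match the earlier sketches (head noun last in each NP):

```python
def modifier_words(np, idf, idf_threshold=2.5):
    """Modifier words of an NP per the paper's constraints: adjectives and
    adverbs at distance >= 2 from the noun head (for NPs longer than 2),
    plus very common nouns (idf < 2.5) meeting the same distance test."""
    if len(np) <= 2:
        return set()
    head_pos = len(np) - 1                       # noun head is last
    mods = set()
    for pos, (w, t) in enumerate(np):
        if head_pos - pos < 2:
            continue
        if t.startswith(("JJ", "RB")) or (
                t.startswith("NN") and idf.get(w.lower(), 9.9) < idf_threshold):
            mods.add(w.lower())
    return mods

def apply_r7a(sentences_nps, idf):
    """R7A: scanning the draft top to bottom, drop a modifier word from an
    NP when the same head was already seen with that modifier earlier."""
    seen = {}                                    # head word -> modifiers seen
    for nps in sentences_nps:
        for i, np in enumerate(nps):
            head = np[-1][0].lower()
            mods = modifier_words(np, idf)
            stale = mods & seen.get(head, set())
            nps[i] = [(w, t) for w, t in np if w.lower() not in stale]
            seen.setdefault(head, set()).update(mods)
    return sentences_nps
```

On the space-station example above, "international" sits at distance 2 from the head "station" and so is registered as a modifier by the first sentence and dropped from the second, while "space" (distance 1) survives, matching the paper's output.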
4 EXPERIMENTS AND RESULTS

For comparing system-generated summaries to reference summaries, we use an automatic summary evaluation tool, ROUGE (Recall-Oriented Understudy for Gisting Evaluation, version 1.5.5) [23][24], developed by the Information Sciences Institute at the University of Southern California. ROUGE is an automated tool which compares a summary generated by an automated system with one or more ideal summaries; the ideal summaries are called models. ROUGE is based on n-gram overlap between the system-produced and reference summaries. ROUGE was used in the 2004 and 2005 Document Understanding Conferences (DUC) (National Institute of Standards and Technology (NIST), 2005) as the evaluation tool. We consider ROUGE-1, ROUGE-2 and ROUGE-SU4 average F-scores to measure the performance of each summarizer. ROUGE-1 evaluates unigram-based overlap between the system-produced and reference summaries, ROUGE-2 evaluates bigram co-occurrence, while ROUGE-SU4 evaluates "skip bigrams", which are pairs of words (in sentence order) having intervening word gaps no larger than 4 words. (ROUGE-1.5.5 was run with the arguments: -n 2 -x -m -2 4 -u -b 665.)

We chose as our input data the document sets used in task 2 of the Document Understanding Conference (DUC) in 2004. Task 2 in DUC 2004 was designed to evaluate short multi-document summary generation. This collection contains 50 test document sets, each with approximately 10 news stories. For each document set, four human-generated summaries are provided for the target length of 665 bytes (approximately 100 words).

In our experiment, for each input document set, a draft summary of 200 words is generated by the sentence extraction method discussed in section 2. Then all the draft summaries are tagged by the POS tagger, and the trimming rules discussed in section 3 are applied one by one to the sentences in each draft summary. After trimming and resizing the draft summary for each document set, a final summary of 665 bytes is selected from the trimmed draft summary. The results of the evaluation of the overall summarization performance using the ROUGE package (version 1.5.5) are shown in Tables 1, 2 and 3 for three ROUGE metrics: ROUGE-1 (unigram-based), ROUGE-2 (bigram-based) and ROUGE-SU4 (skip bigram). The summarization performance before trimming and after trimming is shown
separately in the tables in terms of ROUGE scores.

The proposed system has been compared with the best system (peer 65) and one baseline system (the lead baseline) that participated in task 2 of DUC 2004. The lead baseline simply takes the first 665 bytes of the most recent news article in a document cluster as the summary.

We can see from Tables 1-3 that our system outperforms the top performing system and the baseline system on task 2 of DUC 2004 over all three ROUGE metrics.

TABLE 1
ROUGE-1 SYSTEMS COMPARISON ON DUC 2004 DATASET

System                                        ROUGE-1    95% conf. interval
Our system (Sentence Extraction + Trimming)   0.38982    (0.38400-0.39583)
Sentence Extraction                           0.37620    (0.37068-0.38182)
Peer 65                                       0.37928    (0.37381-0.38437)
Lead baseline                                 0.32106    (0.31539-0.32648)

TABLE 2
ROUGE-2 SYSTEMS COMPARISON ON DUC 2004 DATASET

System                                        ROUGE-2    95% conf. interval
Our system (Sentence Extraction + Trimming)   0.09641    (0.09227-0.10080)
Sentence Extraction                           0.09179    (0.08779-0.09565)
Peer 65                                       0.09162    (0.08746-0.09581)
Lead baseline                                 0.06386    (0.05985-0.06786)

TABLE 3
ROUGE-SU4 SYSTEMS COMPARISON ON DUC 2004 DATASET

System                                        ROUGE-SU4  95% conf. interval
Our system (Sentence Extraction + Trimming)   0.14084    (0.13707-0.14468)
Sentence Extraction                           0.13329    (0.12984-0.13681)
Peer 65                                       0.13227    (0.12883-0.13579)
Lead baseline                                 0.10233    (0.09930-0.10536)

5 CONCLUSION

In this paper, we have shown that the multi-document summaries produced by a sentence extraction method can be improved by using local and global trimming rules. We have used only syntactic trimming rules to eliminate less important or redundant constituents of the summary sentences.

Further improvement in the overall summarization performance may be possible by introducing new trimming rules to compress a draft summary without hampering the grammaticality and the fluency of the summary. We hope that the trimming component proposed in this paper may be used as a plug-in component at the output of any sentence extraction based summarizer to boost its summarization performance.

REFERENCES

1. P. B. Baxendale, "Machine-made index for technical literature—An experiment", IBM Journal of Research and Development, 2(4):354–361, 1958.
2. H. P. Edmundson, "New methods in automatic extracting", Journal of the Association for Computing Machinery, 16(2):264–285, 1969.
3. H. P. Luhn, "The automatic creation of literature abstracts", IBM Journal of Research and Development, 2(2):159–165, 1958.
4. I. Mani and E. Bloedorn, "Summarizing similarities and differences among related documents", Information Retrieval, pp. 35–67, 1999.
5. J. G. Carbonell and J. Goldstein, "The use of MMR, diversity-based re-ranking for reordering documents and producing summaries", In Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Melbourne, Australia, pages 335–336, 1998.
6. K. McKeown, J. Klavans, V. Hatzivassiloglou, R. Barzilay, and E. Eskin, "Towards multi-document summarization by reformulation: Progress and prospects", In Proceedings of the 16th National Conference of the American Association for Artificial Intelligence (AAAI-1999), 18–22 July, pages 453–460, 1999.
7. D. Marcu and L. Gerber, "An inquiry into the nature of multi-document abstracts, extracts, and their evaluation", In Proceedings of the NAACL-2001 Workshop on Automatic Summarization, Pittsburgh, June, pages 1–8, 2001.
8. D. R. Radev, H. Jing and M. Budzikowska, "Centroid-based summarization of multiple documents: Sentence extraction, utility-based evaluation, and user studies", In ANLP/NAACL Workshop on Summarization, Seattle, April, 2000.
9. D. R. Radev, H. Jing, M. Stys and D. Tam, "Centroid-based summarization of multiple documents", Information Processing and Management, 40(6):919–938, 2004.
10. R. Barzilay, K. McKeown and M. Elhadad, "Information fusion in the context of multi-document summarization", In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, College Park, MD, 20–26 June, pages 550–557, 1999.
11. R. Barzilay, M. Elhadad, and K. McKeown, "Sentence ordering in multi-document summarization", In Proceedings of the Human Language Technology Conference, 2001.
12. R. Barzilay, M. Elhadad, and K. McKeown, "Sentence ordering in multi-document summarization", In Proceedings of the Human Language Technology Conference, 2001.
13. J. Otterbacher, D. R. Radev, and A. Lu, "Revisions that improve cohesion in multi-document summaries: A preliminary study", In ACL Workshop on Text Summarization, Philadelphia, 2002.
14. I. Mani, B. Gates and E. Bloedorn, "Improving summaries by revising them", In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics (ACL 99), College Park, MD, June, pages 558–565, 1999.
15. C.-Y. Lin, "Improving summarization performance by sentence compression—A pilot study", In Proceedings of the Sixth International Workshop on Information Retrieval with Asian Languages (IRAL 2003), Sapporo, Japan, July 7, 2003.
16. K. Knight and D. Marcu, "Statistics-based summarization—Step one: Sentence compression", In Proceedings of AAAI-2000, Austin, TX, USA, 2000.
17. E. Hovy, C.-Y. Lin and L. Zhou, "A BE-based multi-document summarizer with sentence compression", In Proceedings of the Multilingual Summarization Evaluation (ACL 2005 Workshop), Ann Arbor, MI, 2005.
18. H. Liu, "MontyLingua: An end-to-end natural language processor with common sense", Available at: web.media.mit.edu/~hugo/montylingua, retrieved in 2004.
19. B. Dorr, D. Zajic and R. Schwartz, "Hedge Trimmer: A parse-and-trim approach to headline generation", In Proceedings of the HLT/NAACL 2003 Text Summarization Workshop and Document Understanding Conference (DUC 2003), Edmonton, Alberta, pages 1–8, 2003.
20. E. Hovy, J. Fukumoto, C.-Y. Lin and L. Zhou, Basic Elements, http://www.isi.edu/~cyl/BE, 2005.
21. K. Sarkar, "Improving multi-document text summarization performance using local and global trimming", In Proceedings of Intelligent Human Computer Interaction, IIIT Allahabad, India, pages 272–282, 2009.
22. G. Erkan and D. R. Radev, "LexRank: Graph-based lexical centrality as salience in text summarization", Journal of Artificial Intelligence Research (JAIR), Volume 22, pages 457–479, 2004.
23. C.-Y. Lin and E. Hovy, "Automatic evaluation of summaries using n-gram co-occurrence statistics", In Proceedings of the 2003 Language Technology Conference (HLT-NAACL 2003), Edmonton, Canada, May 27–June 1, 2003.
24. C.-Y. Lin, "ROUGE: A package for automatic evaluation of summaries", In WAS 2004: Proceedings of the Workshop on Text Summarization Branches Out, July 25–26, Barcelona, 2004.

Kamal Sarkar received the B.E. degree in Computer Science and Engineering from the Faculty of Engineering, Jadavpur University, in 1996. He received the M.E. degree from the same university in 1999. Since 2001 he has been working as a reader in the Department of Computer Science and Engineering, Jadavpur University. He is a member of the Association of Computer Electronics and Electrical Engineers (ACEEE). His research interests are in text summarization, natural language processing, machine learning, web mining, and knowledge discovery from text data.
