* Corresponding author
Authorized licensed use limited to: Kwame Nkrumah Univ of Science and Technology. Downloaded on April 07,2024 at 07:42:15 UTC from IEEE Xplore. Restrictions apply.
2011 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), December 7-9, 2011
arrows illustrate the motion of objects moving in the opposite direction of the camera motion. The length of each arrow indicates the magnitude of the motion vector. Fig. 3(a) illustrates that the global optical flow points upward while the camera is panning down. Fig. 3(b) illustrates that the optical flow points outward to the boundary of the frame while the camera is zooming in. Fig. 3(c) illustrates that the global optical flow points to the left while the camera is panning to the right. In both panning cases, all the vectors of the global optical flow point in the same direction with the same magnitude.
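The relation between the global optical flow and the camera motion described here can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `classify_pan`, the tolerance `tol`, and the representation of the flow as a list of `(dx, dy)` vectors in image coordinates (y growing downward) are all assumptions made for illustration.

```python
import math

def classify_pan(flow, tol=0.2):
    """Guess the camera panning direction from global optical-flow vectors.

    flow: list of (dx, dy) vectors in image coordinates (y grows downward).
    tol:  hypothetical tolerance; every vector must lie within tol * |mean|
          of the mean vector for the flow to count as uniform.
    Returns "left", "right", "up", "down", or None when the flow is not uniform.
    """
    if not flow:
        return None
    mean_dx = sum(dx for dx, _ in flow) / len(flow)
    mean_dy = sum(dy for _, dy in flow) / len(flow)
    mean_mag = math.hypot(mean_dx, mean_dy)
    if mean_mag == 0:
        return None  # e.g. zooming: the vectors cancel out around a center
    for dx, dy in flow:
        if math.hypot(dx - mean_dx, dy - mean_dy) > tol * mean_mag:
            return None  # directions or magnitudes differ too much
    # The camera moves opposite to the global optical flow.
    if abs(mean_dx) >= abs(mean_dy):
        return "right" if mean_dx < 0 else "left"
    return "down" if mean_dy < 0 else "up"
```

For example, a flow in which every vector points upward (negative `dy`) is classified as the camera panning down, matching the case of Fig. 3(a), while a symmetric outward flow as in Fig. 3(b) is rejected as non-uniform.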
B. Keyframe Generating
2) Panning: When the global optical flow of a shot sequence points in one direction with a stable magnitude, the shot sequence is said to be "panning". If the shot sequence is panning, the algorithm generates a panorama keyframe. To generate such a keyframe, the first and the last frames must be selected from the shot sequence, with additional frames as needed. Once the first frame is selected, the magnitude of the vector between two consecutive frames, starting from the first keyframe, is determined. If the magnitude is greater than half of the frame width (or height), the algorithm selects the latter frame as an additional frame. The algorithm then uses the same criterion to recursively select additional keyframes as necessary until the last frame of the shot sequence is reached. The selected keyframes are then stitched together [10]. The keyframe generated from this type of shot sequence is a panoramic frame that covers every element in the respective panning sequence.
3) Zooming: This type of shot sequence occurs when most of the optical flow vectors are arranged around some specific location and the depicted objects are shrinking or enlarging. When zooming is detected, only one frame is selected to represent the shot sequence. The first frame is selected when the shot sequence is zooming in, and the last frame is selected when the shot sequence is zooming out. In other words, the frame that covers the most content is selected as the keyframe.

2) Wide Panel: obtained from a horizontal panning keyframe. Panel A in Fig. 4 is an example of a wide panel.
3) Semi-Wide Panel: obtained from a vertical panning keyframe with dimensions wider than 11:9.
4) Square Panel: obtained from a vertical panning keyframe with dimensions narrower than 11:9 but not narrower than a square. Panel B in Fig. 4 is an example of a square panel.
5) Tall Panel: obtained from a vertical panning keyframe with dimensions narrower than a square. Panel F in Fig. 4 is an example of a tall panel.

A page is set to contain four rows for optimization, except in cases where the page contains no more than one double arrangement. When a panel is in a single arrangement, it consumes more space in the page, so the number of rows in the page is reduced to three in order to fit the space.
With the definitions and initial settings mentioned earlier, the panel organizing is executed as described in Algorithm 1, where P denotes all generated panels, Pi is the i-th panel, and n is the number of all panels obtained from the previous sections. The algorithm starts from the first panel P1 in the collection, and the value of i is incremented along the process. The algorithm is designed to organize the panels using general comics as an example, while trying to optimize the page space as much as possible. It also orders the panels so as to keep confusion to a minimum.
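Algorithm 1 can also be read as the following Python sketch, given here only as an illustration. Several details are assumptions rather than part of the paper: panels are represented by type strings (`"wide"`, `"semiwide"`, `"normal"`, `"square"`, `"tall"`), indexing is zero-based with `i` starting at 0, a look-ahead past the last panel matches no type, and each layout operation is simply recorded as a pair of operation name and panel indexes instead of being drawn on a page.

```python
from typing import List, Optional, Tuple

Call = Tuple[str, List[int]]  # (operation name from Algorithm 1, panel indexes)

def organize_panels(kinds: List[str]) -> List[Call]:
    """Record the layout operations Algorithm 1 would issue for the given panel types."""
    n = len(kinds)
    calls: List[Call] = []

    def kind(j: int) -> Optional[str]:
        # Assumption: looking past the last panel matches no panel type.
        return kinds[j] if j < n else None

    def emit(op: str, *idx: int) -> None:
        # Assumption: indexes beyond the last panel are dropped.
        calls.append((op, [j for j in idx if j < n]))

    i = 0  # assumption: zero-based; the listing does not show the initial value
    while i < n:
        k = kind(i)
        if k in ("wide", "semiwide"):
            emit("setSingle", i)
            i += 1
        elif k == "normal" and kind(i + 1) == "normal":
            if kind(i + 2) == "normal" and kind(i + 3) == "normal":
                emit("setDouble", i, i + 1)
                emit("setSingle", i + 2)
                i += 3
            elif kind(i + 2) == "tall":
                emit("setLeftsidebar", i + 2)
                i += 3
            elif kind(i + 2) == "square" or kind(i + 3) == "square":
                emit("setDouble", i, i + 1)
                emit("setDouble", i + 2, i + 3)
                i += 4
            else:
                emit("setSingle", i, i + 1, i + 2)
                i += 3
        elif k in ("normal", "square"):
            emit("setDouble", i, i + 1)
            i += 2
        elif k == "tall":
            if kind(i + 1) == "normal" and kind(i + 2) == "normal":
                emit("setRightsidebar", i)
                i += 3
            elif kind(i + 1) in ("square", "tall"):
                emit("setDouble", i, i + 1)
                i += 2
            else:
                emit("setSingle", i, i + 1, i + 2)
                i += 3
        else:
            emit("setSingle", i, i + 1, i + 2)
            i += 3
    return calls
```

Recording the operations rather than rendering them keeps the sketch independent of any drawing library; a renderer could consume the returned list row by row.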
panelOrganizing(P)
  while i < n
    if (isWide(Pi)) || (isSemiwide(Pi))
      setSingle(Pi)
      i=i+1
    else if (isNormal(Pi) && isNormal(Pi+1))
      if (isNormal(Pi+2) && isNormal(Pi+3))
        setDouble(Pi,Pi+1)
        setSingle(Pi+2)
        i=i+3
      else if (isTall(Pi+2))
        setLeftsidebar(Pi+2)
        i=i+3
      else if (isSquare(Pi+2) || isSquare(Pi+3))
        setDouble(Pi,Pi+1)
        setDouble(Pi+2,Pi+3)
        i=i+4
      else
        setSingle(Pi,Pi+1,Pi+2)
        i=i+3
    else if (isNormal(Pi))
      setDouble(Pi,Pi+1)
      i=i+2
    else if (isSquare(Pi))
      setDouble(Pi,Pi+1)
      i=i+2
    else if (isTall(Pi))
      if (isNormal(Pi+1) && isNormal(Pi+2))
        setRightsidebar(Pi)
        i=i+3
      else if (isSquare(Pi+1) || isTall(Pi+1))
        setDouble(Pi,Pi+1)
        i=i+2
      else
        setSingle(Pi,Pi+1,Pi+2)
        i=i+3
    else
      setSingle(Pi,Pi+1,Pi+2)
      i=i+3

IV. EXPERIMENTAL RESULTS AND DISCUSSION
The accuracy and performance of our proposed method are evaluated on various types of cartoon animation videos. Some limitations in the keyframe generating step are also discussed in this section. The proposed method is evaluated based on content accuracy using the comic book counterparts of the samples. Note that the sample cartoon animation videos are chosen under the criterion that they have their own comic book adaptation with the same contents, so that the adaptation can be used as an ideal model to compare with the results of the proposed method.

Figure 5. Comparison between (a) a shot sequence and (b) a stitched panorama keyframe.
Figure 6. An example of a distorted panorama keyframe. (a) The hair is swaying while the camera is panning up. (b) Panorama keyframe generated from two frames with distorted points circled.

The sample videos are decoded into MPEG videos that contain only chunks of frames without audio. Each video is 20 minutes long and consists of 28,800 frames. Four major types of videos are sampled for testing: high-quality action, low-quality action, high-quality non-action, and low-quality non-action, each of which has different characteristics. Action videos usually contain more shot transitions and dynamic shots than non-action videos, while the quality of the video determines how elaborately the frames are depicted. High-quality video usually comes with detailed motions that give better details to the resulting keyframes.
Fig. 5 shows how the proposed method passes three selected frames from the shot sequence, as seen in Fig. 5(a), into the algorithm and generates a panorama keyframe that contains all important elements, as seen in Fig. 5(b). It can be seen that the keyframe looks naturally stitched. The proposed keyframe generating process gives a good result when a shot sequence is panning with a minimal amount of local optical flow; i.e., objects are not moving too much while the camera is panning. When some objects are moving along the panning scene, the
stitched frames may generate a panorama keyframe that contains extra or distorted elements. Fig. 6 shows that the keyframe has some distortions on the girl's hair. This is because the algorithm detects a panning shot sequence and captures the first and the last frames from the shot sequence, while the girl's hair is swaying. This panorama keyframe is classified as a distorted keyframe.

TABLE I. GENERATED KEYFRAMES COMPARED WITH COMIC PANELS.
Video Source     Matched    Ideal Panels            Generated Panels
                 Panels     Number    % Accuracy    Number    % Accuracy
HQ-action        320        342       93.57         387       82.69
LQ-action        349        366       95.36         432       80.79
HQ-non-action    232        245       94.69         255       90.98
LQ-non-action    214        224       95.54         271       78.97
Fig. 7 compares the contents generated by the proposed method with the comic book adaptation, where the circled numbers represent the order of each panel in the comic book and the order of the organized panels obtained from the proposed method. Note that Japanese comic books are read from right to left and top to bottom, and so are the panels organized by the proposed method. This comparison shows that most of the comic strips generated from cartoon animation by the proposed method have contents that match those of the respective comic adaptation. The scenes and their contents are depicted and ordered accurately according to their counterparts in the comic book. Some frames even have their aspect ratios altered by the panorama process to match their respective panels (pair no. 8).
The degree of accuracy is shown in Table I. The numbers in the Matched Panels column denote the total number of generated strip panels whose contents match those of their respective comic adaptation panels. Panels are considered matched when, by common-sense judgment, they share the same contents. The numbers in the Ideal Panels column denote the total number of panels featured in the comic books, together with the degree of accuracy obtained by comparing that total with the number of matched panels. Likewise, the numbers in the Generated Panels column denote the total number of strip panels generated by the proposed method, together with the degree of accuracy obtained by comparing that total with the number of matched panels.
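The two accuracy columns of Table I appear to be the number of matched panels divided by the respective total, expressed as a percentage. The short Python sketch below recomputes them from the counts in the table; the names `TABLE_I` and `accuracy` are introduced here for illustration only, and the ratio interpretation is an assumption, although it reproduces every value in the table.

```python
# Counts taken from Table I: (matched, ideal panels, generated panels).
TABLE_I = {
    "HQ-action": (320, 342, 387),
    "LQ-action": (349, 366, 432),
    "HQ-non-action": (232, 245, 255),
    "LQ-non-action": (214, 224, 271),
}

def accuracy(matched: int, total: int) -> float:
    """Percentage of `total` panels that are matched, rounded to two decimals."""
    return round(100.0 * matched / total, 2)

for source, (matched, ideal, generated) in TABLE_I.items():
    print(source, accuracy(matched, ideal), accuracy(matched, generated))
```

Under this reading, the Ideal Panels accuracy measures how much of the comic book is covered, while the Generated Panels accuracy measures how many of the generated panels have a counterpart in the comic book.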
These results show that the degree of accuracy in the Generated Panels column is lower because the number of keyframes generated along the process is always greater than the number of comic panels. This is because animated cartoons usually use several shot sequences to cover contents that can be compressed into one panel in comic books. Nevertheless, the degree of accuracy in the Generated Panels column is less significant than that in the Ideal Panels column, because the main objective of the proposed method is to cover as much of the contents of the ideal model as possible. The surplus panels usually add more detail to those contents, and the proposed method manages to cover at least 90% of the ideal contents in all cases. The comic strips are neatly generated, and their order is easy to follow without confusion.

Figure 7. Comparison between (a) comic book version and (b) results from the panel organizing algorithm.

V. CONCLUSIONS
This paper proposes a novel method to generate comic strips from cartoon animations. Considering that the existing fixed-aspect-ratio video summarization methods [1], [2], [3], [4], [5], [6], [7] lack content coverage, this paper proposes to generate panorama keyframes with the aim of covering more contents and representing them more systematically. The method is designed especially for video summarization of cartoon animation. It exploits some characteristics that distinguish cartoon animation from other types of video media.
The proposed method first marks the time code of a cartoon animation video based on shot boundaries and optical flow direction to separate periods of time into shot sequences. Each shot sequence is then passed through the keyframe generating algorithm. By using the optical flow information, the algorithm classifies shot sequences and generates keyframes of various sizes. Each of these keyframes is then treated as a comic strip panel and is organized into comic pages according to the proposed algorithm, which intends to optimize space usage.
The results are examined by comparing the comic strips obtained from the proposed method to their comic book adaptations. The results show that the generated comic strips manage to cover most of the contents featured in the comic books. Almost all of the panels have contents that match those in the ideal model, which represents how the results are supposed to be. The generated comic strips are also easy for users to follow and comprehend.