

(12) United States Patent
     Kondo et al.
(10) Patent No.: US 8,718,141 B2
(45) Date of Patent: May 6, 2014

(54) MOVING PICTURE CODING METHOD AND MOVING PICTURE DECODING METHOD FOR PERFORMING INTER PICTURE PREDICTION CODING AND INTER PICTURE PREDICTION DECODING USING PREVIOUSLY PROCESSED PICTURES AS REFERENCE PICTURES

(75) Inventors: Satoshi Kondo, Yawata (JP); Shinya Kadono, Nishinomiya (JP); Makoto Hagai, Moriguchi (JP); Kiyofumi Abe, Kadoma (JP)

(73) Assignee: Panasonic Corporation, Osaka (JP)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 876 days.

(21) Appl. No.: 11/980,558

(22) Filed: Oct. 31, 2007

(65) Prior Publication Data
     US 2008/0069232 A1    Mar. 20, 2008

Related U.S. Application Data
(62) Division of application No. 10/468,119, filed as application No. PCT/JP03/02099 on Feb. 26, 2003, now Pat. No. 7,664,180.

(30) Foreign Application Priority Data
     Mar. 4, 2002 (JP) 2002-056919
     Apr. 19, 2002 (JP) 2002-118598
     Jul. 2, 2002 (JP) 2002-193027

(51) Int. Cl.
     H04N 7/12 (2006.01)

(52) U.S. Cl.
     USPC 375/240.16

(58) Field of Classification Search
     CPC H04N 7/362
     USPC 375/240.01, 240.14, 240.15, 240.16, 240.24, 240.26
     IPC H04N 7/12
     See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS
     5,329,365 A    7/1994    Uz
     5,386,234 A    1/1995    Veltman et al.

FOREIGN PATENT DOCUMENTS
     CN 1136877    11/1996
     CN 1207228    2/1999

OTHER PUBLICATIONS
     International Search Report issued Aug. 5, 2003 in the International (PCT) Application No. PCT/JP03/04805.

Primary Examiner: Young Lee
(74) Attorney, Agent, or Firm: Wenderoth, Lind & Ponack, LLP.

(57) ABSTRACT

A coding control unit (110) and a mode selection unit (109) are included. The coding control unit (110) determines the coding order for a plurality of consecutive B-pictures located between I-pictures and P-pictures so that the B-picture whose temporal distance from two previously coded pictures is farthest in display order is coded by priority, so as to reorder the B-pictures in coding order. When a current block is coded in direct mode, the mode selection unit (109) scales a forward motion vector of a block which is included in a backward reference picture of a current picture and co-located with the current block, so as to generate motion vectors of the current block, if the forward motion vector has been used for coding the co-located block.

4 Claims, 21 Drawing Sheets

[Representative drawing: motion vectors in direct mode across pictures P5, B6, B7, B8 and P9, with block b, motion vectors c, d and e (MVB), and temporal distance TRD.]

[Drawing sheets 1 to 21 are not reproduced. Labels recoverable from the sheets: FIG. 2 is marked PRIOR ART and shows motion vector d and block b; FIGS. 7A to 7D and FIGS. 8A to 8D show pictures P5, B6, B7, B8 and P9 with motion vectors labeled MV, MVF and MVB and temporal distances labeled TRD, TRF and TRB.]
MOVING PICTURE CODING METHOD AND MOVING PICTURE DECODING METHOD FOR PERFORMING INTER PICTURE PREDICTION CODING AND INTER PICTURE PREDICTION DECODING USING PREVIOUSLY PROCESSED PICTURES AS REFERENCE PICTURES

This application is a divisional application of application Ser. No. 10/468,119, filed Aug. 15, 2003, now U.S. Pat. No. 7,664,180 which is the National Stage of International Application No. PCT/JP03/02099, filed Feb. 26, 2003.

TECHNICAL FIELD

The present invention relates to moving picture coding methods and moving picture decoding methods, and particularly to methods for performing inter picture prediction coding and inter picture prediction decoding of a current picture using previously processed pictures as reference pictures.

BACKGROUND ART

In moving picture coding, data amount is generally compressed by utilizing the spatial and temporal redundancies that exist within a moving picture. Generally speaking, frequency transformation is used as a method utilizing the spatial redundancies, and inter picture prediction coding is used as a method utilizing the temporal redundancies. In the inter picture prediction coding, for coding a current picture, previously coded pictures earlier or later than the current picture in display order are used as reference pictures. The amount of motion of the current picture from the reference picture is estimated, and the difference between the picture data obtained by motion compensation based on that amount of motion and the picture data of the current picture is calculated, so that the temporal redundancies are eliminated. The spatial redundancies are further eliminated from this differential value so as to compress the data amount of the current picture.

In the moving picture coding method called H.264 which has been developed for standardization, a picture which is coded not using inter picture prediction but using intra picture coding is referred to as an I-picture, a picture which is coded using inter picture prediction with reference to one previously processed picture which is earlier or later than a current picture in display order is referred to as a P-picture, and a picture which is coded using inter picture prediction with reference to two previously processed pictures which are earlier or later than a current picture in display order is referred to as a B-picture (See ISO/IEC 14496-2 "Information technology - Coding of audio-visual objects - Part 2: Visual" pp. 218-219).

FIG. 1A is a diagram showing relationship between respective pictures and the corresponding reference pictures in the above-mentioned moving picture coding method, and FIG. 1B is a diagram showing the sequence of the pictures in the bit stream generated by coding.

A picture I1 is an I-picture, pictures P5, P9 and P13 are P-pictures, and pictures B2, B3, B4, B6, B7, B8, B10, B11 and B12 are B-pictures. As shown by the arrows, the P-pictures P5, P9 and P13 are coded using inter picture prediction from the I-picture I1 and P-pictures P5 and P9 respectively as reference pictures.

As shown by the arrows, the B-pictures B2, B3 and B4 are coded using inter picture prediction from the I-picture I1 and P-picture P5 respectively as reference pictures. In the same manner, the B-pictures B6, B7 and B8 are coded using the P-pictures P5 and P9 respectively as reference pictures, and the B-pictures B10, B11 and B12 are coded using the P-pictures P9 and P13 respectively as reference pictures.

In the above-mentioned coding, the reference pictures are coded prior to the pictures which refer to the reference pictures. Therefore, the bit stream is generated by the above coding in the sequence as shown in FIG. 1B.

By the way, in the H.264 moving picture coding method, a coding mode called direct mode can be selected. An inter picture prediction method in direct mode will be explained with reference to FIG. 2. FIG. 2 is an illustration showing motion vectors in direct mode, and particularly showing the case of coding a block a in the picture B6 in direct mode. In this case, a motion vector c used for coding a block b in the picture P9 is utilized. The block b is co-located with the block a and the picture P9 is a backward reference picture of the picture B6. The motion vector c is a vector used for coding the block b and refers to the picture P5. The block a is coded using bi-prediction based on the reference blocks obtained from the forward reference picture P5 and the backward reference picture P9 using vectors parallel to the motion vector c. In other words, the motion vectors used for coding the block a are the motion vector d for the picture P5 and the motion vector e for the picture P9.

However, when B-pictures are coded using inter picture prediction with reference to I and P-pictures, the temporal distance between the current B-picture and the reference picture may be long, which causes reduction of coding efficiency. Particularly when a lot of B-pictures are located between adjacent I-picture and P-picture or two P-pictures closest to each other, coding efficiency is significantly reduced.

The present invention has been conceived in order to solve the above-mentioned problem, and it is an object of the present invention to provide a moving picture coding method and a moving picture decoding method for avoiding efficiency reduction of coding B-pictures if a lot of B-pictures are located between an I-picture and a P-picture or between two P-pictures. In addition, it is another object to provide a moving picture coding method and a moving picture decoding method for improving coding efficiency in direct mode.

DISCLOSURE OF INVENTION

In order to achieve above-mentioned object, the moving picture coding method of the present invention is a moving picture coding method for coding picture data corresponding to pictures that form a moving picture and generating a bit stream, the moving picture coding method comprising: a coding step for coding a current picture as one of an I-picture, a P-picture and a B-picture, the I-picture having only blocks which are intra picture coded, the P-picture having a block which is inter picture prediction coded with uni-predictive reference using a previously coded picture as a first reference picture, and the B-picture having a block which is inter picture prediction coded with bi-predictive reference using previously coded pictures as a first reference picture and a second reference picture, wherein the coding step includes a control step for determining coding order which is different from display order for consecutive B-pictures located between I-pictures and P-pictures.

Therefore, since B-pictures can be coded using pictures which are temporally closer in display order as reference pictures, prediction efficiency for motion compensation is improved and thus coding efficiency can be increased.
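As an illustrative sketch only (not the patent's implementation), the two orderings discussed above can be compared in Python. The first function assumes, following the description of FIG. 1A and FIG. 1B, that each I- or P-picture is coded before the B-pictures that refer to it and that those B-pictures keep their display order among themselves. The second function is one possible reading of the rule stated in the abstract, that the B-picture farthest from the previously coded pictures is coded first: it repeatedly picks the B-picture whose nearest already-coded picture is farthest away and breaks ties by display order. The function names, the tuple representation of pictures and the tie-break are assumptions made for this example.

```python
from typing import List, Tuple

Picture = Tuple[str, int]  # (picture type, display-order number), e.g. ("B", 6)


def conventional_coding_order(display_order: List[Picture]) -> List[Picture]:
    """Code each I/P-picture before the B-pictures that precede it in display order.

    Assumption: B-pictures between two anchors keep their display order.
    """
    coded: List[Picture] = []
    pending_b: List[Picture] = []
    for picture in display_order:
        if picture[0] == "B":
            pending_b.append(picture)      # wait for the backward reference
        else:
            coded.append(picture)          # anchor picture is coded first
            coded.extend(pending_b)        # then the B-pictures that refer to it
            pending_b = []
    coded.extend(pending_b)                # trailing B-pictures, if any
    return coded


def farthest_first_b_order(prev_anchor: int, next_anchor: int,
                           b_numbers: List[int]) -> List[int]:
    """Order consecutive B-pictures so the one farthest from coded pictures goes first.

    Hypothetical reading of the abstract: distance is measured to the nearest
    already-coded picture; ties go to the earlier picture in display order.
    """
    coded = [prev_anchor, next_anchor]
    remaining = sorted(b_numbers)
    order: List[int] = []
    while remaining:
        best = max(remaining,
                   key=lambda n: (min(abs(n - c) for c in coded), -n))
        order.append(best)
        coded.append(best)
        remaining.remove(best)
    return order


if __name__ == "__main__":
    sequence = [("I", 1), ("B", 2), ("B", 3), ("B", 4), ("P", 5),
                ("B", 6), ("B", 7), ("B", 8), ("P", 9)]
    print(conventional_coding_order(sequence))
    print(farthest_first_b_order(5, 9, [6, 7, 8]))  # -> [7, 6, 8]
```

Under these assumptions, the B-pictures B6, B7 and B8 between P5 and P9 come out as B7, B6, B8, so that B6 and B8 have the already coded B7 available as a temporally closer reference.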
Also, the moving picture coding method according to the present invention is a moving picture coding method for coding picture data corresponding to pictures that form a moving picture and generating a bit stream, the moving picture coding method comprising: a coding step for coding a current picture as a B-picture having a block which is inter picture prediction coded with bi-predictive reference using previously coded pictures as a first reference picture and a second reference picture, wherein in the coding step, when a current block A in a current B-picture is coded in direct mode by which motion compensation of the current block A is performed using motion vectors of the current block A obtained from a motion vector of a previously coded block, the motion vectors for performing the motion compensation of the current block A are obtained by scaling a first motion vector, based on a first reference picture, of a co-located block B in the second reference picture of the current block A, using a difference specified by information indicating display order of pictures.

Therefore, when the direct mode is selected, since a first motion vector of a second reference picture is scaled, there is no need to add motion vector information to a bit stream, and prediction efficiency can also be improved.

Likewise, when a current block A in a current B-picture is coded in direct mode, the motion vectors for performing the motion compensation of the current block A may be obtained by scaling a second motion vector, based on a second reference picture, of a co-located block B in the second reference picture of the current block A, using a difference specified by information indicating display order of pictures.

Therefore, when the direct mode is selected, since a second motion vector of a second reference picture is scaled, there is no need to add motion vector information to a bit stream, and prediction efficiency can also be improved.

Furthermore, when a current block A in a current B-picture is coded in direct mode, if a co-located block B in the second reference picture of the current block A is previously coded in direct mode, the motion vectors for performing the motion compensation of the current block A may be obtained by scaling a first motion vector, based on a first reference picture of the block B, substantially used for coding the block B in the second reference picture, using a difference specified by information indicating display order of pictures.

Therefore, when the direct mode is selected, since a first motion vector of a second reference picture which has been substantially used for coding the second reference picture is scaled, there is no need to add motion vector information to a bit stream, and prediction efficiency can also be improved.

Also, when a current block A in a current B-picture is coded in direct mode, the motion vectors for performing the motion compensation of the current block A may be obtained by scaling a first motion vector, based on a first reference picture, of a co-located block B in a temporally later P-picture, using a difference specified by information indicating display order of pictures.

Therefore, when the direct mode is selected, since a first motion vector of a temporally later P-picture is scaled, there is no need to add motion vector information to a bit stream, and prediction efficiency can also be improved.

Furthermore, when a current block A in a current B-picture is coded in direct mode, the motion vectors for performing the motion compensation of the current block A may be obtained by scaling a first motion vector if a co-located block B in the second reference picture of the current block A is coded using at least the first motion vector based on a first reference picture of the block B, and scaling a second motion vector if the block B is coded using only the second motion vector based on a second reference picture of the block B, using a difference specified by information indicating display order of pictures.

Therefore, when the direct mode is selected, if a second reference picture has a first motion vector, this first motion vector is scaled, and if the second reference picture does not have a first motion vector but only a second motion vector, this second motion vector is scaled. So, there is no need to add motion vector information to a bit stream, and prediction efficiency can be improved.

In addition, the moving picture decoding method according to the present invention is a moving picture decoding method for decoding a bit stream which is generated by coding picture data corresponding to pictures that form a moving picture, the moving picture decoding method comprising: a decoding step for decoding a current picture by inter picture prediction using a previously decoded picture as a reference picture, wherein in the decoding step, when the current picture is decoded by the inter picture prediction with bi-predictive reference using the previously decoded pictures as a first reference picture and a second reference picture, a bit stream including at least a picture which is temporally closest to the current picture in display order, as the first reference picture or the second reference picture, is decoded.

Therefore, a bit stream, which is generated by coding a picture by inter picture prediction with bi-predictive reference using pictures which are temporally close in display order as a first reference picture and a second reference picture, can be properly decoded.

Also, the moving picture decoding method according to the present invention is a moving picture decoding method for decoding a bit stream which is generated by coding picture data corresponding to pictures that form a moving picture, the moving picture decoding method comprising: a decoding step for decoding a current picture by inter picture prediction using a previously decoded picture as a reference picture, wherein in the decoding step, when the current picture is a picture having a block which is decoded by inter picture prediction with bi-predictive reference using previously decoded pictures as a first reference picture and a second reference picture, and a current block A is decoded in direct mode by which motion compensation of the current block A is performed using motion vectors of the current block A obtained from a motion vector of a previously decoded block, the motion vectors for performing the motion compensation of the current block A are obtained by scaling a first motion vector, based on a first reference picture, of a co-located block B in the second reference picture of the current block A, using a difference specified by information indicating display order of pictures.

Therefore, when the direct mode is selected, since a first motion vector of a second reference picture is scaled, proper decoding can be achieved.

Likewise, when a current picture is a picture having a block which is decoded by inter picture prediction with bi-predictive reference and a current block A is decoded in direct mode, the motion vectors for performing the motion compensation of the current block A may be obtained by scaling a second motion vector, based on a second reference picture, of a co-located block B in the second reference picture of the current block A, using a difference specified by information indicating display order of pictures.

Therefore, when the direct mode is selected, since a second motion vector of a second reference picture is scaled, proper decoding can be achieved.

Furthermore, when a current picture is a picture having a block which is decoded by inter picture prediction with bi-predictive reference and a current block A is decoded in direct mode, if a co-located block B in the second reference picture of the current block A is previously decoded in direct mode, the motion vectors for performing the motion compensation of the current block A may be obtained by scaling a first motion vector, based on a first reference picture of the block B, substantially used for decoding the block B in the second reference picture, using a difference specified by information indicating display order of pictures.

Therefore, when the direct mode is selected, since a first motion vector of a second reference picture which has been substantially used for decoding the second reference picture is scaled, proper decoding can be achieved.

Also, when a current picture is a picture having a block which is decoded by inter picture prediction with bi-predictive reference and a current block A is decoded in direct mode, the motion vectors for performing the motion compensation of the current block A may be obtained by scaling a first motion vector, based on a first reference picture, of a co-located block B in a temporally later picture, using a difference specified by information indicating display order of pictures, the later picture being inter picture prediction decoded with uni-predictive reference using a previously decoded picture as a first reference picture.

Therefore, when the direct mode is selected, since a first motion vector of a picture which is decoded by inter picture prediction with uni-predictive reference is scaled, proper decoding can be achieved.

The present invention can be realized not only as such a moving picture coding method and a moving picture decoding method as mentioned above, but also as a moving picture coding apparatus and a moving picture decoding apparatus including characteristic steps of these moving picture coding method and moving picture decoding method. In addition, the present invention can be realized as a bit stream obtained by coding by the moving picture coding method so as to distribute it via a recording medium such as a CD-ROM or a transmission medium such as the Internet.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a schematic diagram showing prediction relations between pictures and their sequence in the conventional moving picture coding method, and FIG. 1A shows the relations between respective pictures and the corresponding reference pictures, and FIG. 1B shows the sequence of the pictures in a bit stream generated by coding.

FIG. 2 is a schematic diagram showing motion vectors in direct mode in the conventional moving picture coding method.

FIG. 3 is a block diagram showing the structure of a first embodiment of a moving picture coding apparatus using a moving picture coding method according to the present invention.

FIG. 4 is an illustration of picture numbers and relative indices in the embodiments of the present invention.

FIG. 5 is a conceptual illustration of a moving picture coded data format in the moving picture coding apparatus in the embodiments of the present invention.

FIG. 6 is an illustration showing the picture sequence in a reordering memory in the embodiments of the present invention, and FIG. 6A shows the sequence in input order, and FIG. 6B shows the reordered sequence.

FIG. 7 is a schematic diagram showing motion vectors in direct mode in the embodiments of the present invention, and FIG. 7A shows a case where a current block a is in a picture B7, FIG. 7B shows first and second examples in a case where a current block a is in a picture B6, FIG. 7C shows a third example in a case where a current block a is in a picture B6, and FIG. 7D shows a fourth example in a case where a current block a is in a picture B6.

FIG. 8 is a schematic diagram showing motion vectors in direct mode in the embodiments of the present invention, and FIG. 8A shows a fifth example in a case where a current block a is in a picture B6, FIG. 8B shows a sixth example in a case where a current block a is in a picture B6, FIG. 8C shows a seventh example in a case where a current block a is in a picture B6, and FIG. 8D shows a case where a current block a is in a picture B8.

FIG. 9 is a schematic diagram showing prediction relations between respective pictures and their sequence in the embodiments of the present invention, and FIG. 9A shows the prediction relations between respective pictures indicated in display order, and FIG. 9B shows the sequence of the pictures reordered in coding order (in a bit stream).

FIG. 10 is a schematic diagram showing prediction relations between respective pictures and their sequence in the embodiments of the present invention, and FIG. 10A shows the prediction relations between respective pictures indicated in display order, and FIG. 10B shows the sequence of the pictures reordered in coding order (in a bit stream).

FIG. 11 is a schematic diagram showing prediction relations between respective pictures and their sequence in the embodiments of the present invention, and FIG. 11A shows the prediction relations between respective pictures indicated in display order, and FIG. 11B shows the sequence of the pictures reordered in coding order (in a bit stream).

FIG. 12 is a schematic diagram showing hierarchically the picture prediction structure as shown in FIG. 6 in the embodiments of the present invention.

FIG. 13 is a schematic diagram showing hierarchically the picture prediction structure as shown in FIG. 9 in the embodiments of the present invention.

FIG. 14 is a schematic diagram showing hierarchically the picture prediction structure as shown in FIG. 10 in the embodiments of the present invention.

FIG. 15 is a schematic diagram showing hierarchically the picture prediction structure as shown in FIG. 11 in the embodiments of the present invention.

FIG. 16 is a block diagram showing the structure of an embodiment of a moving picture decoding apparatus using a moving picture decoding method according to the present invention.

FIG. 17 is an illustration of a recording medium for storing a program for realizing the moving picture coding method and the moving picture decoding method in the first and second embodiments by a computer system, and FIG. 17A shows an example of a physical format of a flexible disk as a body of recording medium, FIG. 17B shows a cross-sectional view and a front view of the appearance of the flexible disk and the flexible disk itself, and FIG. 17C shows a structure for recording and reproducing the program on the flexible disk FD.

FIG. 18 is a block diagram showing the overall configuration of a content supply system for realizing content distribution service.

FIG. 19 is a sketch showing an example of a mobile phone.

FIG. 20 is a block diagram showing the internal structure of the mobile phone.

FIG. 21 is a block diagram showing the overall configuration of a digital broadcast system.
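The direct-mode derivation summarized before the figure list, and illustrated in FIG. 2, FIG. 7 and FIG. 8, can be sketched as follows. This is a minimal example under stated assumptions, not the method as claimed: this part of the text says only that the co-located block's vector is scaled "using a difference specified by information indicating display order of pictures", so the linear scaling by display-order differences used here is the usual temporal-scaling form and is an assumption. The class `CoLocatedBlock`, the function names and the use of plain display-order numbers are hypothetical. The selection rule (scale the first motion vector if the co-located block has one, otherwise the second) follows the text; which pictures the two derived vectors refer to in the second-vector case is not specified in this excerpt, so the sketch simply applies the same rule.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vector = Tuple[float, float]


@dataclass
class CoLocatedBlock:
    """Motion data of block B in the second reference picture (hypothetical container)."""
    first_mv: Optional[Vector] = None    # vector based on B's first reference picture
    first_ref: Optional[int] = None      # display-order number of that picture
    second_mv: Optional[Vector] = None   # vector based on B's second reference picture
    second_ref: Optional[int] = None     # display-order number of that picture


def scale_vector(cur: int, ref2: int, mv: Vector, target: int) -> Tuple[Vector, Vector]:
    """Scale mv (from picture ref2 toward picture target) for the current picture cur.

    Assumed linear scaling: both derived vectors are parallel to mv and their
    lengths are proportional to display-order differences.
    Returns (vector toward target, vector toward ref2).
    """
    trd = ref2 - target                  # display-order span covered by mv
    if trd == 0:
        raise ValueError("co-located vector must refer to a different picture")
    to_target = tuple(c * (cur - target) / trd for c in mv)
    to_ref2 = tuple(c * (cur - ref2) / trd for c in mv)
    return to_target, to_ref2


def direct_mode_vectors(cur: int, ref2: int, block: CoLocatedBlock) -> Tuple[Vector, Vector]:
    """Pick the vector to scale: the first motion vector if present, else the second."""
    if block.first_mv is not None and block.first_ref is not None:
        return scale_vector(cur, ref2, block.first_mv, block.first_ref)
    if block.second_mv is not None and block.second_ref is not None:
        return scale_vector(cur, ref2, block.second_mv, block.second_ref)
    raise ValueError("co-located block has no motion vector to scale")


if __name__ == "__main__":
    # Prior-art case of FIG. 2: block a in B6, block b in P9 with vector c toward P5.
    block_b = CoLocatedBlock(first_mv=(8.0, -4.0), first_ref=5)
    vector_d, vector_e = direct_mode_vectors(cur=6, ref2=9, block=block_b)
    print(vector_d, vector_e)  # -> (2.0, -1.0) (-6.0, 3.0)
```

In this example the two results are parallel to the co-located vector, matching the description of vectors d and e as "parallel to the motion vector c"; no motion vector needs to be transmitted for the current block because both values are derived from data the decoder already has.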
US 8,718,141 B2
7 8
BEST MODE FOR CARRYING OUT THE INVENTION
The embodiments of the present invention will be explained below with reference to the figures.
First Embodiment
FIG. 3 is a block diagram showing the structure of an embodiment of the moving picture coding apparatus using the moving picture coding method according to the present invention.
As shown in FIG. 3, the moving picture coding apparatus includes a reordering memory 101, a difference calculation unit 102, a residual error coding unit 103, a bit stream generation unit 104, a residual error decoding unit 105, an addition unit 106, a reference picture memory 107, a motion vector estimation unit 108, a mode selection unit 109, a coding control unit 110, switches 111-115 and a motion vector storage unit 116.
The reordering memory 101 stores moving pictures inputted on a picture-to-picture basis in display order. The coding control unit 110 reorders the pictures stored in the reordering memory 101 in coding order. The coding control unit 110 also controls the operation of the motion vector storage unit 116 for storing motion vectors.
Using the previously coded and decoded picture data as a reference picture, the motion vector estimation unit 108 estimates a motion vector indicating a position which is predicted optimum in the search area in the reference picture. The mode selection unit 109 determines a mode for coding macroblocks using the motion vector estimated by the motion vector estimation unit 108, and generates predictive image data based on the coding mode. The difference calculation unit 102 calculates the difference between the image data read out from the reordering memory 101 and the predictive image data inputted by the mode selection unit 109, and generates residual error image data.
The residual error coding unit 103 performs coding processing such as frequency transform and quantization on the inputted residual error image data for generating the coded data. The bit stream generation unit 104 performs variable length coding or the like on the inputted coded data, and further adds the motion vector information, the coding mode information and other relevant information inputted by the mode selection unit 109 to the coded data so as to generate a bit stream.
The residual error decoding unit 105 performs decoding processing such as inverse quantization and inverse frequency transform on the inputted coded data for generating decoded differential image data. The addition unit 106 adds the decoded differential image data inputted by the residual error decoding unit 105 and the predictive image data inputted by the mode selection unit 109 for generating decoded image data. The reference picture memory 107 stores the generated decoded image data.
FIG. 4 is an illustration of pictures and relative indices. The relative indices are used for identifying uniquely reference pictures stored in the reference picture memory 107, and they are associated to respective pictures as shown in FIG. 4. The relative indices are also used for indicating the reference pictures which are to be used for coding blocks using inter picture prediction.
data “Block1” for direct mode, block coded data “Block2” for the inter picture prediction other than the direct mode, and the like. The block coded data “Block2” for the inter picture prediction other than direct mode has a first relative index “RIdx1” and a second relative index “RIdx2” for indicating two reference pictures used for inter picture prediction, a first motion vector “MV1” and a second motion vector “MV2” in this order. On the other hand, the block coded data “Block1” for direct mode does not have the first and second relative indices “RIdx1” and “RIdx2” and the first and second motion vectors “MV1” and “MV2”. The index which is to be used, the first relative index “RIdx1” or the second relative index “RIdx2”, can be determined by the prediction type “PredType”. Also, the first relative index “RIdx1” indicates a first reference picture, and the second relative index “RIdx2” indicates a second reference picture. In other words, whether a picture is a first reference picture or a second reference picture is determined based on where they are located in the bit stream.
Note that a P-picture is coded by inter picture prediction with uni-predictive reference using a previously coded picture which is located earlier or later in display order as a first reference picture, and a B-picture is coded by inter picture prediction with bi-predictive reference using previously coded pictures which are located earlier or later in display order as a first reference picture and a second reference picture. In the first embodiment, the first reference picture is explained as a forward reference picture, and the second reference picture is explained as a backward reference picture. Furthermore, the first and second motion vectors for the first and second reference pictures are explained as a forward motion vector and a backward motion vector respectively.
Next, how to assign the first and second relative indices will be explained with reference to FIG. 4A.
As the first relative indices, in the information indicating display order, the values incremented by 1 from 0 are first assigned to the reference pictures earlier than the current picture from the picture closer to the current picture. After the values incremented by 1 from 0 are assigned to all the reference pictures earlier than the current picture, then the subsequent values are assigned to the reference pictures later than the current picture from the picture closer to the current picture.
As the second relative indices, in the information indicating display order, the values incremented by 1 from 0 are assigned to the reference pictures later than the current picture from the picture closer to the current picture. After the values incremented by 1 from 0 are assigned to all the reference pictures later than the current picture, then the subsequent values are assigned to the reference pictures earlier than the current picture from the picture closer to the current picture.
For example, in FIG. 4A, when the first relative index “RIdx1” is 0 and the second relative index “RIdx2” is 1, the forward reference picture is the B-picture No. 6 and the backward reference picture is the P-picture No. 9. Here, these picture numbers 6 and 9 indicate the display order.
Relative indices in a block are represented by variable length code words, and the codes with shorter lengths are assigned to the indices of the smaller values. Since the picture which is closest to the current picture is usually selected as a reference picture for inter picture prediction, coding efficiency is improved by assigning the relative index values in order of closeness to the current picture.
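The default assignment of the first and second relative indices described with reference to FIG. 4A can be sketched as follows. This is only an illustrative sketch: the function name `assign_relative_indices`, the current picture number 7 and the set of reference pictures {5, 6, 8, 9} are assumptions chosen so that the result agrees with the FIG. 4A example quoted in the text (first relative index 0 is picture No. 6, second relative index 1 is picture No. 9); FIG. 4A itself is not reproduced here.

```python
def assign_relative_indices(current, reference_pictures):
    """Return (first, second) lists of display-order picture numbers.

    The position of a picture in each list is its relative index.
    Hypothetical helper; pictures are identified by display-order number.
    """
    # Earlier pictures, closest to the current picture first.
    earlier = sorted((p for p in reference_pictures if p < current), reverse=True)
    # Later pictures, closest to the current picture first.
    later = sorted(p for p in reference_pictures if p > current)
    # First relative indices: earlier pictures first, then later pictures.
    first = earlier + later
    # Second relative indices: later pictures first, then earlier pictures.
    second = later + earlier
    return first, second


# Assumed example: current picture No. 7 with reference pictures 5, 6, 8 and 9.
first, second = assign_relative_indices(7, [5, 6, 8, 9])
print(first)   # first relative indices 0, 1, 2, 3
print(second)  # second relative indices 0, 1, 2, 3
```

Because the closest pictures receive the smallest indices, they also receive the shortest variable length code words.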
FIG. 5 is a conceptual illustration of moving picture coded data format used by the moving picture coding apparatus. Coded data “Picture” for one picture includes header coded data “Header” included in the head of the picture, block coded
Assignment of reference pictures to relative indices can be changed arbitrarily if it is explicitly indicated using buffer control signal in coded data (RPSL in Header as shown in FIG. 5). This enables to change the reference picture with the
second relative index “0” to an arbitrary reference picture in the reference picture memory 107. As shown in FIG. 4B, assignment of reference indices to pictures can be changed, for example.
Next, the operation of the moving picture coding apparatus structured as above will be explained below.
FIG. 6 is an illustration showing the picture sequence in the reordering memory 101, and FIG. 6A shows the sequence in input order and FIG. 6B shows the reordered sequence. Here, vertical lines show pictures, and the numbers indicated at the lower right of the pictures show the picture types (I, P and B) with the first alphabetical letters and the picture numbers indicating display order with the following numbers.
As shown in FIG. 6A, a moving picture is inputted to the reordering memory 101 on a picture-to-picture basis in display order, for example. When the pictures are inputted to the reordering memory 101, the coding control unit 110 reorders the pictures inputted to the reordering memory 101 in coding order. The pictures are reordered based on the reference relations in inter picture prediction coding, and more specifically, the pictures are reordered so that the pictures used as reference pictures are coded earlier than the pictures which use the reference pictures.
Here, it is assumed that a P-picture refers to one neighboring previously processed I or P-picture which is located earlier or later than the current P-picture in display order, and a B-picture refers to two neighboring previously processed pictures which are located earlier or later than the current B-picture in display order.
The pictures are coded in the following order. First, a B-picture at the center of B-pictures (3 B-pictures in FIG. 6A, for instance) located between two P-pictures is coded, and then another B-picture closer to the earlier P-picture is coded. For example, the pictures B6, B7, B8 and P9 are coded in the order of P9, B7, B6 and B8.
In this case, in FIG. 6A, the picture pointed by the arrow refers to the picture at the origin of the arrow. Specifically, B-picture B7 refers to P-pictures P5 and P9, B6 refers to P5 and B7, and B8 refers to B7 and P9, respectively. The coding control unit 110 reorders the pictures in coding order, as shown in FIG. 6B.
Next, the pictures reordered in the reordering memory 101 are read out in a unit for every motion compensation. Here, the unit of motion compensation is referred to as a macroblock which is 16 (horizontal) x 16 (vertical) pixels in size. Coding of the pictures P9, B7, B6 and B8 shown in FIG. 6A will be explained below in this order.
coding mode indicates the method of coding macroblocks. As for P-pictures, it determines any of the coding methods, intra picture coding, inter picture prediction coding using a motion vector and inter picture prediction coding without using a motion vector (where motion is handled as “0”). For determining a coding mode, a method is selected so that a coding error is reduced with a small amount of bits.
The mode selection unit 109 outputs the determined coding mode to the bit stream generation unit 104. If the coding mode determined by the mode selection unit 109 is inter picture prediction coding, the motion vector which is to be used for the inter picture prediction coding is outputted to the bit stream generation unit 104 and further stored in the motion vector storage unit 116.
The mode selection unit 109 generates predictive image data based on the determined coding mode for outputting to the difference calculation unit 102 and the addition unit 106. However, when selecting intra picture coding, the mode selection unit 109 does not output predictive image data. In addition, when selecting intra picture coding, the mode selection unit 109 controls the switches 111 and 112 to connect to “a” side and “c” side respectively, and when selecting inter picture prediction coding, it controls them to connect to “b” side and “d” side respectively. The case will be explained below where the mode selection unit 109 selects inter picture prediction coding.
The difference calculation unit 102 receives the image data of the macroblock in the picture P9 read out from the reordering memory 101 and the predictive image data outputted from the mode selection unit 109. The difference calculation unit 102 calculates the difference between the image data of the macroblock in the picture P9 and the predictive image data, and generates the residual error image data for outputting to the residual error coding unit 103.
The residual error coding unit 103 performs coding processing such as frequency transform and quantization on the inputted residual error image data and thus generates the coded data for outputting to the bit stream generation unit 104 and the residual error decoding unit 105. Here, the coding processing such as frequency transform and quantization is performed in every 8 (horizontal) x 8 (vertical) pixels or 4 (horizontal) x 4 (vertical) pixels, for example.
The bit stream generation unit 104 performs variable length coding or the like on the inputted coded data, and further adds information such as motion vectors and a coding mode, header information and so on to the coded data for generating and outputting the bit stream.
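The reordering from display order to coding order described for FIG. 6 can be sketched as follows. This is a minimal illustration, assuming pictures are identified by labels such as "B6" and that the B-pictures between two P-pictures are handled as one group; the function name `coding_order` is hypothetical. The text only fixes the order for three B-pictures (P9, B7, B6, B8), so coding the remaining B-pictures in display order for larger groups is an assumption of this sketch.

```python
def coding_order(b_pictures, next_anchor):
    """Reorder one group of B-pictures and the following P-picture.

    b_pictures: B-picture labels in display order, e.g. ["B6", "B7", "B8"].
    next_anchor: the P-picture that follows them in display order.
    """
    if not b_pictures:
        return [next_anchor]
    center = len(b_pictures) // 2
    # Remaining B-pictures, the one closer to the earlier P-picture first.
    rest = b_pictures[:center] + b_pictures[center + 1:]
    # The later P-picture is coded first, then the center B-picture.
    return [next_anchor, b_pictures[center]] + rest


order = coding_order(["B6", "B7", "B8"], "P9")
print(order)

# Reference relations stated in the text for FIG. 6A.
references = {
    "P9": ["P5"],
    "B7": ["P5", "P9"],
    "B6": ["P5", "B7"],
    "B8": ["B7", "P9"],
}
# Check that every reference picture is coded before the picture using it.
coded = ["P5"]  # P5 has already been coded
for picture in order:
    assert all(ref in coded for ref in references[picture])
    coded.append(picture)
```

The final loop checks the property the text requires of the reordering: pictures used as reference pictures are coded earlier than the pictures which refer to them.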
(Coding of Picture P9)
The P-picture P9 is coded using inter picture prediction with reference to one previously processed picture located earlier or later than P9 in display order. In coding P9, the picture P5 is the reference picture, as mentioned above. P5 has already been coded and the decoded picture thereof is stored in the reference picture memory 107. In coding P-pictures, the coding control unit 110 controls switches 113, 114 and 115 so as to be ON. The macroblocks in the picture P9 read out from the reordering memory 101 are thus inputted to the motion vector estimation unit 108, the mode selection unit 109 and the difference calculation unit 102 in this order.
The motion vector estimation unit 108 estimates a motion vector of a macroblock in the picture P9, using the decoded picture data of the picture P5 stored in the reference picture memory 107 as a reference picture, and outputs the estimated motion vector to the mode selection unit 109.
The mode selection unit 109 determines the mode for coding the macroblock in the picture P9 using the motion vector estimated by the motion vector estimation unit 108. Here, the
On the other hand, the residual error decoding unit 105 performs decoding processing such as inverse quantization and inverse frequency transform on the inputted coded data and generates the decoded differential image data for outputting to the addition unit 106. The addition unit 106 adds the decoded differential image data and the predictive image data inputted by the mode selection unit 109 for generating the decoded image data, and stores it in the reference picture memory 107.
That is the completion of coding one macroblock in the picture P9. According to the same processing, the remaining macroblocks of the picture P9 are coded. And after all the macroblocks of the picture P9 are coded, the picture B7 is coded.
(Coding of Picture B7)
The picture B7 refers to the picture P5 as a forward reference picture and the picture P9 as a backward reference picture. Since the picture B7 is used as a reference picture for coding other pictures, the coding control unit 110 controls the switches 113, 114 and 115 so as to be ON, which causes the
macroblocks in the picture B7 read out from the reordering memory 101 to be inputted to the motion vector estimation unit 108, the mode selection unit 109 and the difference calculation unit 102.
Using the decoded picture data of the picture P5 and the decoded picture data of the picture P9 which are stored in the reference picture memory 107 as a forward reference picture and a backward reference picture respectively, the motion vector estimation unit 108 estimates a forward motion vector and a backward motion vector of the macroblock in the picture B7. And the motion vector estimation unit 108 outputs the estimated motion vectors to the mode selection unit 109.
The mode selection unit 109 determines the coding mode for the macroblock in the picture B7 using the motion vectors estimated by the motion vector estimation unit 108. Here, it is assumed that a coding mode for B-pictures can be selected from among intra picture coding, inter picture prediction coding using a forward motion vector, inter picture prediction coding using a backward motion vector, inter picture prediction coding using bi-predictive motion vectors and direct mode.
Operation of direct mode coding will be explained with reference to FIG. 7A. FIG. 7A is an illustration showing motion vectors in direct mode, and specifically shows the case where the block a in the picture B7 is coded in direct mode. In this case, a motion vector c, which has been used for coding the block b in the picture P9, is utilized. The block b is co-located with the block a, and the picture P9 is a backward reference picture of the picture B7. The motion vector c is stored in the motion vector storage unit 116. The block a is bi-predicted from the forward reference picture P5 and the backward reference picture P9 using vectors obtained utilizing the motion vector c. For example, as a method of utilizing the motion vector c, there is a method of generating motion vectors parallel to the motion vector c. In this case, the motion vector d and the motion vector e are used for the picture P5 and the picture P9 respectively for coding the block a.
In this case where the forward motion vector d is MVF, the backward motion vector e is MVB, the motion vector c is MV, the temporal distance between the backward reference picture P9 for the current picture B7 and the picture P5 which the block in the backward reference picture P9 refers to is TRD, and the temporal distance between the current picture B7 and the forward reference picture P5 is TRF respectively, the motion vector d MVF and the motion vector e MVB are respectively calculated by Equation 1 and Equation 2. Note that the temporal distance between the pictures can be determined based on the information indicating the display order (position) given to the respective pictures or the difference specified by the information.
MVF = MV × TRF / TRD    Equation 1
MVB = (TRF - TRD) × MV / TRD    Equation 2
where MVF and MVB respectively represent horizontal components and vertical components of the motion vectors, and the plus and minus signs indicate directions of the motion vectors.
By the way, as for selection of a coding mode, a method for reducing coding error with a smaller amount of bits is generally selected. The mode selection unit 109 outputs the determined coding mode to the bit stream generation unit 104. If the coding mode determined by the mode selection unit 109 is inter picture prediction coding, the motion vectors used for the inter picture prediction coding are outputted to the bit stream generation unit 104 and further stored in the motion vector storage unit 116. When the direct mode is selected, the
motion vectors which are calculated according to Equation 1 and Equation 2 and used for direct mode are stored in the motion vector storage unit 116.
The mode selection unit 109 also generates predictive image data based on the determined coding mode for outputting to the difference calculation unit 102 and the addition unit 106, although it does not output the predictive image data if it selects the intra picture coding. In addition, when selecting the intra picture coding, the mode selection unit 109 controls the switches 111 and 112 to connect to “a” side and “c” side respectively, and when selecting the inter picture prediction coding or direct mode, it controls the switches 111 and 112 to connect to “b” side and “d” side respectively. The case will be explained below where the mode selection unit 109 selects the inter picture prediction coding or the direct mode.
The difference calculation unit 102 receives the image data of the macroblock of the picture B7 read out from the reordering memory 101 and the predictive image data outputted from the mode selection unit 109. The difference calculation unit 102 calculates the difference between the image data of the macroblock of the picture B7 and the predictive image data, and generates the residual error image data for outputting to the residual error coding unit 103.
The residual error coding unit 103 performs coding processing such as frequency transform and quantization on the inputted residual error image data and thus generates the coded data for outputting to the bit stream generation unit 104 and the residual error decoding unit 105.
The bit stream generation unit 104 performs variable length coding or the like on the inputted coded data, and further adds information such as motion vectors and a coding mode and so on to that data for generating and outputting a bit stream.
On the other hand, the residual error decoding unit 105 performs decoding processing such as inverse quantization and inverse frequency transform on the inputted coded data and generates the decoded differential image data for outputting to the addition unit 106. The addition unit 106 adds the decoded differential image data and the predictive image data inputted by the mode selection unit 109 for generating the decoded image data, and stores it in the reference picture memory 107.
That is the completion of coding one macroblock in the picture B7. According to the same processing, the remaining macroblocks in the picture B7 are coded. And after all the macroblocks of the picture B7 are coded, the picture B6 is coded.
(Coding of Picture B6)
Since the picture B6 is a B-picture, B6 is coded using inter picture prediction with reference to two previously processed pictures located earlier or later than B6 in display order. The B-picture B6 refers to the picture P5 as a forward reference picture and the picture B7 as a backward reference picture, as described above. Since the picture B6 is not used as a reference picture for coding other pictures, the coding control unit 110 controls the switch 113 to be ON and the switches 114 and 115 to be OFF, which causes the macroblock of the picture B6 read out from the reordering memory 101 to be inputted to the motion vector estimation unit 108, the mode selection unit 109 and the difference calculation unit 102.
Using the decoded picture data of the picture P5 and the decoded picture data of the picture B7 which are stored in the reference picture memory 107 as a forward reference picture and a backward reference picture respectively, the motion vector estimation unit 108 estimates the forward motion vector and the backward motion vector for the macroblock in the
picture B6. And the motion vector estimation unit 108 outputs the estimated motion vectors to the mode selection unit 109.
The mode selection unit 109 determines the coding mode for the macroblock in the picture B6 using the motion vectors estimated by the motion vector estimation unit 108.
Here, the first example of direct mode coding operation for the macroblock in the picture B6 will be explained with reference to FIG. 7B. FIG. 7B is an illustration showing motion vectors in direct mode, and specifically showing the case where the block a in the picture B6 is coded in direct mode. In this case, a motion vector c, which has been used for coding a block b in the picture B7, is utilized. The block b is co-located with the block a, and the picture B7 is a backward reference picture of the picture B6. Here, it is assumed that the block b is coded by forward reference only or bi-predictive reference and the forward motion vector of the block b is the motion vector c. It is also assumed that the motion vector c is stored in the motion vector storage unit 116. The block a is bi-predicted from the forward reference picture P5 and the backward reference picture B7 using motion vectors generated utilizing the motion vector c. For example, if a method of generating motion vectors parallel to the motion vector c is used, as is the case of the above-mentioned picture B7, the motion vector d and the motion vector e are used for the picture P5 and the picture B7 respectively for coding the block a.
In this case where the forward motion vector d is MVF, the backward motion vector e is MVB, the motion vector c is MV, the temporal distance between the backward reference picture B7 for the current picture B6 and the picture P5 which the block b in the backward reference picture B7 refers to is TRD, and the temporal distance between the current picture B6 and the forward reference picture P5 is TRF respectively, the motion vector d MVF and the motion vector e MVB are respectively calculated by above-mentioned Equation 1 and Equation 2. Note that the temporal distance between the pictures can be determined based on the information indicating display order of the pictures or the difference specified by the information, for instance.
As described above, in direct mode, by scaling the forward motion vector of a backward reference B-picture, there is no need to transmit motion vector information, and motion prediction efficiency can be improved. Accordingly, coding efficiency can be improved. In addition, by using reference pictures temporally closest available in display order as a forward reference picture and a backward reference picture, coding efficiency can be increased.
Next, the second example of the direct mode will be explained with reference to FIG. 7B. In this case, the motion vector, which has been used for coding the block b in the picture B7, is utilized. The block b is co-located with the block a, and the picture B7 is a backward reference picture for the picture B6. Here, it is assumed that the block b has been coded in direct mode and the forward motion vector which has been substantially used for coding the block b is the motion vector c. Specifically, the motion vector c is obtained by scaling the motion vector used for coding a block i, co-located with the block b, in the picture P9 that is the backward reference picture for the picture B7. The motion vector c stored in the motion vector storage unit 116 is used, or the motion vector c is obtained by reading out from the motion vector storage unit 116 the motion vector of the block i in the picture P9 which has been used for coding the block b in direct mode and calculating based on that motion vector. When the motion vector which is obtained by scaling for coding the block b in the picture B7 in direct mode is stored in the motion vector storage unit 116, only the forward motion vector needs to be stored. The block a is bi-predicted from the forward reference picture P5 and the backward reference picture B7 using the motion vectors generated utilizing the motion vector c. For example, if a method of generating motion vectors parallel to the motion vector c is used, as is the case of the above-mentioned first example, motion vectors used for coding the block a are the motion vector d and the motion vector e for the picture P5 and the picture B7 respectively.
In this case, the forward motion vector d MVF and the backward motion vector e MVB of the block a are respectively calculated by above-mentioned Equation 1 and Equation 2, as in the case of the first example.
As described above, in direct mode, since the forward motion vector of a backward reference B-picture which has been substantially used for coding the B-picture in direct mode is scaled, there is no need to transmit the motion vector information, and motion prediction efficiency can be improved even if the co-located block in the backward reference picture has been coded in direct mode. Accordingly, coding efficiency can be improved. In addition, by using reference pictures which are temporally closest available in display order as a forward reference picture and a backward reference picture, coding efficiency can be increased.
Next, the third example of direct mode will be explained with reference to FIG. 7C. FIG. 7C is an illustration showing motion vectors in direct mode, and specifically showing the case where the block a in the picture B6 is coded in direct mode. In this case, the motion vector which has been used for coding the block b in the picture B7 is utilized. The picture B7 is a backward reference picture for the picture B6, and the block b in the picture B7 is co-located with the block a in the picture B6. Here, it is assumed that the block b has been coded using a backward motion vector only and the backward motion vector used for coding the block b is a motion vector f. Specifically, the motion vector f is assumed to be stored in the motion vector storage unit 116. The block a is bi-predicted from the forward reference picture P5 and the backward reference picture B7 using motion vectors generated utilizing the motion vector f. For example, if a method of generating motion vectors parallel to the motion vector f is used, as is the case of the above-mentioned first example, motion vectors used for coding the block a are the motion vector g and the motion vector h for the picture P5 and the picture B7 respectively.
In this case, where the forward motion vector g is MVF, the backward motion vector h is MVB, the motion vector f is MV, the temporal distance between the backward reference picture B7 for the current picture B6 and the picture P9 which the block in the backward reference picture B7 refers to is TRD, the temporal distance between the current picture B6 and the forward reference picture P5 is TRF, and the temporal distance between the current picture B6 and the backward reference picture B7 is TRB respectively, the motion vector g MVF and the motion vector h MVB are respectively calculated by Equation 3 and Equation 4.
MVF = -TRF × MV / TRD    Equation 3
MVB = TRB × MV / TRD    Equation 4
As described above, in direct mode, since the backward motion vector of a co-located block in a backward reference B-picture which has been used for coding the block is scaled, there is no need to transmit motion vector information, and motion prediction efficiency can be improved even if the co-located block in the backward reference picture has only the backward motion vector. Accordingly, the coding efficiency can be improved. In addition, by using reference pictures which are temporally closest available in display order as a forward reference picture and a backward reference picture, coding efficiency can be increased.
Next, the fourth example of direct mode will be explained with reference to FIG. 7D. FIG. 7D is an illustration showing motion vectors in direct mode, and specifically showing the case where the block a in the picture B6 is coded in direct mode. In this case, the motion vector which has been used for coding the block b in the picture B7 is utilized. The picture B7 is the backward reference picture for the picture B6, and the block b is co-located with the block a in the picture B6. Here, it is assumed that the block b has been coded using the backward motion vector only, as is the case of the third example, and the backward motion vector used for coding the block b is the motion vector f. Specifically, the motion vector f is assumed to be stored in the motion vector storage unit 116. The block a is bi-predicted from the reference picture P9 which is referred to by the motion vector f and the backward reference picture B7 using motion vectors generated utilizing the motion vector f. For example, if a method of generating motion vectors parallel to the motion vector f is used, as is the case of the above-mentioned first example, motion vectors used for coding the block a are the motion vector g and the motion vector h for the picture P9 and the picture B7 respectively.
In this case, where the forward motion vector g is MVF, the backward motion vector h is MVB, the motion vector f is MV, the temporal distance between the backward reference picture B7 for the current picture B6 and the picture P9 which the block in the backward reference picture B7 refers to is TRD, and the temporal distance between the current picture B6 and the picture P9 which the block b in the backward reference picture B7 refers to is TRF respectively, the motion vector g MVF and the motion vector h MVB are respectively calculated by Equation 1 and Equation 2.
been used for coding the block f in the picture P9 is utilized. The picture P9 is located later than the picture B6, and the block f is co-located with the block a in the picture B6. The motion vector g is stored in the motion vector storage unit 116. The block a is bi-predicted from the forward reference picture P5 and the backward reference picture B7 using motion vectors generated utilizing the motion vector g. For example, if a method of generating motion vectors parallel to the motion vector g is used, as is the case of the above-mentioned first example, motion vectors used for coding the block a are the motion vector h and the motion vector i for the picture P5 and the picture B7 respectively for coding the block a.
In this case, where the forward motion vector h is MVF, the backward motion vector i is MVB, the motion vector g is MV, the temporal distance between the picture P9 which is located later in display order than the current picture B6 and the picture P5 which the block f in the picture P9 refers to is TRD, the temporal distance between the current picture B6 and the forward reference picture P5 is TRF, and the temporal distance between the current picture B6 and the backward reference picture B7 is TRB respectively, the motion vector h MVF and the motion vector i MVB are respectively calculated by Equation 1 and Equation 5.
MVB = -TRB × MV / TRD    Equation 5
As described above, in direct mode, by scaling the motion vector of the P-picture which is located later in display order, there is no need to store the motion vector of a B-picture if the B-picture is the backward reference picture, and there is also no need to transmit the motion vector information. In addition, by using reference pictures which are temporally closest in display order as a forward reference picture and a backward reference picture, coding efficiency can be increased.
As described above, in direct mode, by Scaling the back Next, the seventh example of the direct mode will be
ward motion vector of a co-located block in a backward explained with reference to FIG. 8C. FIG. 8C is an illustration
reference B-picture which has been used for coding the block, showing motion vectors in direct mode, and specifically
there is no need to transmit motion vector information, and 40 showing the case where the blocka in the picture B6 is coded
motion prediction efficiency can be improved even if the in direct mode. This example shows the case where the above
co-located block in the backward reference picture has only mentioned assignment of relative indices to the picture num
the backward motion vector. Accordingly, coding efficiency bers is changed (remapped) and the picture P9 is a backward
can be improved. In addition, by using a picture referred to by reference picture. In this case, the motion vector g which has
the backward motion vector as a forward reference picture, 45 been used for coding the block fin the picture P9 is utilized.
and a reference picture which is temporally closest available The picture P9 is the backward reference picture for the
in display order as a backward reference picture, coding effi picture B7, and the block fisco-located with the blocka in the
ciency can be increased. picture B6. The motion vectorg is stored in the motion vector
Next, the fifth example of the direct mode will be explained storage unit 116. The blocka is bi-predicted from the forward
with reference to FIG. 8A. FIG. 8A is an illustration showing 50 reference picture P5 and the backward reference picture P9
motion vectors in direct mode, and specifically showing the using motion vectors generated utilizing the motion vectorg.
case where the block a of the picture B6 is coded in direct For example, if a method of generating motion vectors par
mode. In this case, on the assumption that the value of the allel to the motion vector g, as is the case of the above
motion vectors is “0”, bi-predictive reference is performed for mentioned first example, motion vectors used for coding the
motion compensation, using the picture P5 as a forward ref 55 blocka are the motion vector hand the motion vectori for the
erence picture and the picture B7 as a backward reference picture P5 and the picture P9 respectively.
picture. In this case, where the forward motion vector his MVF, the
As mentioned above, by forcing the motion vector “O'” in backward motion vectori is MVB, the motion vectorg is MV.
direct mode, when the direct mode is selected, there is no need the temporal distance between the backward reference pic
to transmit the motion vector information nor to Scale the 60 ture P9 for the current picture B6 and the picture P5 which the
motion vector, and thus the processing Volume can be block fin the picture P9 refers to is TRD, and the temporal
reduced. distance between the current picture B6 and the forward
Next, the sixth example of the direct mode will be reference picture P5 is TRF respectively, the motion vector h
explained with reference to FIG. 8B. FIG.8B is an illustration MVF and the motion vector i MVB are respectively calcu
showing motion vectors in direct mode, and specifically 65 lated by Equation 1 and Equation 2.
showing the case where the blocka in the picture B6 is coded As described above, in direct mode, the motion vector of
in direct mode. In this case, the motion vector g which has the previously coded picture can be scaled even if the relative
indices to the picture numbers are remapped, and when the direct mode is selected, there is no need to transmit the motion vector information.
When the block a in the picture B6 is coded in direct mode, the block in the backward reference picture for the picture B6 which is co-located with the block a is coded by the forward reference only, bi-predictive reference, or direct mode. And when a forward motion vector has been used for this coding, this forward motion vector is scaled, and the block a is coded in direct mode, as is the case of the above-mentioned first, second or seventh example. On the other hand, when the block co-located with the block a has been coded by backward reference only using a backward motion vector, this backward motion vector is scaled, and the block a is coded in direct mode, as is the case of the above-mentioned third or fourth example.
Above-mentioned direct mode is applicable not only to the case where a time interval between pictures is fixed but also to the case where it is variable.
The mode selection unit 109 outputs the determined coding mode to the bit stream generation unit 104. Also, the mode selection unit 109 generates predictive image data based on the determined coding mode and outputs it to the difference calculation unit 102. However, if selecting intra picture coding, the mode selection unit 109 does not output predictive image data. The mode selection unit 109 controls the switches 111 and 112 so as to be connected to “a” side and “c” side respectively if selecting intra picture coding, and controls the switches 111 and 112 so as to be connected to “b” side and “d” side if selecting inter picture prediction coding or a direct mode. If the determined coding mode is inter picture prediction coding, the mode selection unit 109 outputs the motion vectors used for the inter picture prediction coding to the bit stream generation unit 104. Since the picture B6 is not used as a reference picture for coding other pictures, there is no need to store the motion vectors used for the inter picture prediction coding in the motion vector storage unit 116. The case will be explained below where the mode selection unit 109 selects the inter picture prediction coding or the direct mode.
The difference calculation unit 102 receives the image data of the macroblock in the picture B6 read out from the reordering memory 101 and the predictive image data outputted from the mode selection unit 109. The difference calculation unit 102 calculates the difference between the image data of the macroblock in the picture B6 and the predictive image data and generates the residual error image data for outputting to the residual error coding unit 103. The residual error coding unit 103 performs coding processing such as frequency transform and quantization on the inputted residual error image data, and thus generates the coded data for outputting to the bit stream generation unit 104.
The bit stream generation unit 104 performs variable length coding or the like on the inputted coded data, further adds information such as motion vectors and a coding mode and so on to the data, and generates the bit stream for outputting.
That is the completion of coding one macroblock in the picture B6. According to the same processing, the remaining macroblocks in the picture B6 are coded. And after all the macroblocks in the picture B6 are coded, the picture B8 is coded.
(Coding of Picture B8)
Since a picture B8 is a B-picture, inter picture prediction coding is performed for the picture B8 with reference to two previously processed pictures located earlier or later than B8 in display order. The B-picture B8 refers to the picture B7 as a forward reference picture and the picture P9 as a backward reference picture, as described above. Since the picture B8 is not used as a reference picture for coding other pictures, the coding control unit 110 controls the switch 113 to be ON and the switches 114 and 115 to be OFF, which causes the macroblocks in the picture B8 read out from the reordering memory 101 to be inputted to the motion vector estimation unit 108, the mode selection unit 109 and the difference calculation unit 102.
Using the decoded picture data of the picture B7 and the decoded picture data of the picture P9 which are stored in the reference picture memory 107 as a forward reference picture and a backward reference picture respectively, the motion vector estimation unit 108 estimates the forward motion vector and the backward motion vector for the macroblock in the picture B8. And the motion vector estimation unit 108 outputs the estimated motion vectors to the mode selection unit 109.
The mode selection unit 109 determines the coding mode for the macroblock in the picture B8 using the motion vectors estimated by the motion vector estimation unit 108.
Here, the case where the macroblock in the picture B8 is coded using the direct mode will be explained with reference to FIG. 8D. FIG. 8D is an illustration showing motion vectors in direct mode, and specifically showing the case where a block a in the picture B8 is coded in direct mode. In this case, a motion vector c which has been used for coding a block b in the backward picture P9 is utilized. The reference picture P9 is located later than the picture B8, and the block b in the picture P9 is co-located with the block a. Here, it is assumed that the block b has been coded by forward reference and the forward motion vector for the block b is the motion vector c. The motion vector c is stored in the motion vector storage unit 116. The block a is bi-predicted from the forward reference picture B7 and the backward reference picture P9 using motion vectors generated utilizing the motion vector c. For example, if a method of generating motion vectors parallel to the motion vector c is used, as is the case of the above-mentioned picture B7, the motion vector d and the motion vector e are used for the picture B7 and the picture P9 respectively for coding the block a.
In this case where the forward motion vector d is MVF, the backward motion vector e is MVB, the motion vector c is MV, the temporal distance between the backward reference picture P9 for the current picture B8 and the picture P5 which the block b in the backward reference picture P9 refers to is TRD, the temporal distance between the current picture B8 and the forward reference picture B7 is TRF, and the temporal distance between the current picture B8 and the backward reference picture P9 is TRB respectively, the motion vector d MVF and the motion vector e MVB are respectively calculated by Equation 1 and Equation 5.
As described above, in direct mode, by scaling the forward motion vector of the backward reference picture, when the direct mode is selected, there is no need to transmit the motion vector information and the motion prediction efficiency can be improved. Accordingly, coding efficiency can be improved. In addition, by using reference pictures which are temporally closest available in display order as forward and backward reference pictures, coding efficiency can be increased.
Above-mentioned direct mode is applicable not only to the case where a time interval between pictures is fixed but also to the case where it is variable.
The mode selection unit 109 outputs the determined coding mode to the bit stream generation unit 104. Also, the mode selection unit 109 generates predictive image data based on the determined coding mode and outputs it to the difference calculation unit 102. However, if selecting intra picture coding, the mode selection unit 109 does not output predictive image data. The mode selection unit 109 controls the switches 111 and 112 so as to be connected to “a” side and “c” side respectively if selecting intra picture coding, and controls the switches 111 and 112 so as to be connected to “b” side and “d” side if selecting inter picture prediction coding or direct mode. If the determined coding mode is inter picture prediction coding, the mode selection unit 109 outputs the motion vectors used for the inter picture prediction coding to the bit stream generation unit 104. Since the picture B8 is not used as a reference picture for coding other pictures, there is no need to store the motion vectors used for the inter picture prediction coding in the motion vector storage unit 116. The case will be explained below where the mode selection unit 109 selects the inter picture prediction coding or direct mode.
The difference calculation unit 102 receives the image data of the macroblock in the picture B8 read out from the reordering memory 101 and the predictive image data outputted from the mode selection unit 109. The difference calculation unit 102 calculates the difference between the image data of the macroblock in the picture B8 and the predictive image data and generates the residual error image data for outputting to the residual error coding unit 103. The residual error coding unit 103 performs coding processing such as frequency transform and quantization on the inputted residual error image data and thus generates the coded data for outputting to the bit stream generation unit 104.
The bit stream generation unit 104 performs variable length coding or the like on the inputted coded data, further adds information such as motion vectors and a coding mode and so on to the data, and generates the bit stream for outputting.
That is the completion of coding one macroblock in the picture B8. According to the same processing, the remaining macroblocks in the picture B8 are coded.
According to the above-mentioned respective coding procedures for the pictures P9, B7, B6 and B8, other pictures are coded depending on their picture types and temporal locations in display order.
In the above-mentioned embodiment, the moving picture coding method according to the present invention has been explained taking the case where the picture prediction structure as shown in FIG. 6A is used as an example. FIG. 12 is an illustration showing this picture prediction structure hierarchically. In FIG. 12, arrows indicate prediction relations, in which the pictures pointed by the arrows refer to the pictures located at the origins of the arrows. In the picture prediction structure as shown in FIG. 6A, the coding order is determined by giving a top priority to the pictures which are farthest from the previously processed pictures in display order, as shown in FIG. 12. For example, the picture farthest from an I-picture or a P-picture is that located in the center of the consecutive B-pictures. Therefore, if the pictures P5 and P9 have been coded, the picture B7 is to be coded next. And if the pictures P5, B7 and P9 have been coded, the pictures B6 and B8 are to be coded next.
In addition, the moving picture coding method according to the present invention can be used for other picture prediction structures than those as shown in FIG. 6 and FIG. 12, so as to produce the effects of the present invention. FIGS. 9-11 show the examples of other picture prediction structures.
FIG. 9 shows the case where 3 B-pictures are located between I-pictures and P-pictures and the B-picture closest from the previously processed picture is selected for coding first. FIG. 9A is a diagram showing prediction relations between respective pictures arranged in display order, and FIG. 9B is a diagram showing the sequence of pictures reordered in coding order (a bit stream). FIG. 13 is a hierarchical diagram of the picture prediction structure corresponding to FIG. 9A. In the picture prediction structure as shown in FIG. 9A, the pictures closest in display order from the previously processed pictures are coded first, as shown in FIG. 13. For example, if the pictures P5 and P9 have been coded, the pictures B6 and B8 are to be coded next. If the pictures P5, B6, B8 and P9 have been coded, the picture B7 is to be coded next.
FIG. 10 shows the case where 5 B-pictures are located between I-pictures and P-pictures and the B-picture which is farthest from the previously processed picture is selected for coding first. FIG. 10A is a diagram showing prediction relations between respective pictures arranged in display order, and FIG. 10B is a diagram showing the sequence of pictures reordered in coding order (a bit stream). FIG. 14 is a hierarchical diagram of the picture prediction structure corresponding to FIG. 10A. In the picture prediction structure as shown in FIG. 10A, the coding order is determined by giving a top priority to the pictures farthest in display order from the previously processed pictures, as shown in FIG. 14. For example, the picture farthest from an I-picture or a P-picture is the B-picture in the center of the consecutive B-pictures. Therefore, if the pictures P7 and P13 have been coded, the picture B10 is to be coded next. If the pictures P7, B10 and P13 have been coded, the pictures B8, B9, B11 and B12 are to be coded next.
FIG. 11 shows the case where 5 B-pictures are located between I-pictures and P-pictures and the B-picture which is closest from the previously processed picture is selected for coding first. FIG. 11A is a diagram showing prediction relations between respective pictures arranged in display order, and FIG. 11B is a diagram showing the sequence of pictures reordered in coding order (a bit stream). FIG. 15 is a hierarchical diagram of the picture prediction structure corresponding to FIG. 11A. In the picture prediction structure as shown in FIG. 11A, the pictures closest in display order from the previously processed pictures are coded first, as shown in FIG. 15. For example, if the pictures P5 and P9 have been coded, the pictures B8 and B12 are to be coded next. If the pictures P5, B8, B12 and P9 have been coded, the pictures B9 and B11 are to be coded next. Furthermore, if the pictures P5, B8, B9, B11, B12 and P9 have been coded, the picture B10 is to be coded next.
As described above, according to the moving picture coding method of the present invention, when inter picture prediction coding is performed on a plurality of B-pictures located between I-pictures and P-pictures using bi-predictive reference, they are coded in another order than display order. For that purpose, the pictures located as close to the current picture as possible in display order are used as forward and backward pictures. As a reference picture, a B-picture is also used if it is available. When a plurality of B-pictures located between I-pictures and P-pictures are coded in different order from display order, the picture farthest from the previously processed picture is to be coded first. Or, when a plurality of B-pictures located between I-pictures and P-pictures are coded in different order from display order, the picture closest from the previously processed picture is to be coded first.
According to the moving picture coding method of the present invention, the above-mentioned operation makes it possible to use a picture closer to a current B-picture in display order as a reference picture for coding it. Prediction efficiency is thus increased for motion compensation and coding efficiency is increased.
In addition, according to the moving picture coding method of the present invention, for coding a block in a B-picture in direct mode with reference to a B-picture previously coded as a backward reference picture, if the co-located block in the backward reference B-picture has been coded by forward reference or bi-predictive reference, a motion vector obtained by scaling the forward motion vector of the backward reference B-picture is used as a motion vector in direct mode.
As mentioned above, in direct mode, by scaling a forward motion vector of a backward reference B-picture, there is no need to transmit motion vector information, and prediction efficiency can be increased. In addition, by using a reference picture temporally closest in display order as a forward reference picture, coding efficiency can be increased.
Or, if a co-located block in a backward reference B-picture is coded in direct mode, a motion vector obtained by scaling the forward motion vector substantially used in direct mode is used as a motion vector in direct mode.
As mentioned above, in direct mode, by scaling a forward motion vector of a backward reference B-picture which has been substantially used for the direct mode coding, there is no need to transmit motion vector information, and prediction efficiency can be increased even if the co-located block in the backward reference picture is coded in direct mode. In addition, coding efficiency can be improved by using a temporally closest reference picture as a forward reference picture.
Or, if a co-located block in a backward reference B-picture is coded by backward reference, motion vectors obtained by scaling the backward motion vector of the block are used as motion vectors in direct mode.
As mentioned above, in direct mode, by scaling a backward motion vector which has been used for coding a co-located block in the backward reference B-picture, there is no need to transmit motion vector information, and prediction efficiency can be increased even if the co-located block in the backward reference picture has only a backward motion vector. In addition, by using a temporally closest reference picture as a forward reference picture, coding efficiency can be improved.
Or, if a co-located block in a backward reference B-picture is coded by backward reference, motion vectors obtained by scaling the backward motion vector used for that coding, with reference to the picture referred to by this backward motion vector and the backward reference picture, are used as motion vectors in direct mode.
As mentioned above, in direct mode, by scaling a backward motion vector which has been used for coding a co-located block in the backward reference B-picture, there is no need to transmit motion vector information, and prediction efficiency can be increased even if the co-located block in the backward reference picture has only a backward motion vector. Accordingly, coding efficiency can be improved. In addition, by using a picture referred to by the backward motion vector as a forward reference picture and a reference picture temporally closest available in display order as a backward reference picture, coding efficiency can be increased.
Or, in direct mode, a motion vector which is forced to be set to “0” is used.
By forcing a motion vector to be set to “0” in direct mode, when the direct mode is selected, there is no need to transmit the motion vector information nor to scale the motion vector, and therefore the processing volume can be reduced.
In addition, according to the moving picture coding method of the present invention, for coding a block in a B-picture in direct mode with reference to a B-picture which has been previously coded as a backward reference picture, a motion vector obtained by scaling the forward motion vector which has been used for coding the co-located block in the later P-picture is used as a motion vector in direct mode.
As mentioned above, in direct mode, by scaling a motion vector of a later P-picture, if the backward reference picture is a B-picture, there is no need to store the motion vectors of the B-picture and there is no need to transmit the motion vector information, and thus prediction efficiency can be increased. In addition, by using a temporally closest reference picture as a forward reference picture, coding efficiency can be improved.
When assignment of relative indices to picture numbers is changed and a co-located block in a backward reference picture has been coded by forward reference, motion vectors obtained by scaling that forward motion vector are used as motion vectors in direct mode.
As mentioned above, in direct mode, a motion vector of a previously coded picture can be scaled even if assignment of relative indices to picture numbers is changed, and there is no need to transmit motion vector information.
In the present embodiment, the case has been explained where motion compensation is made in every 16 (horizontal) × 16 (vertical) pixels and residual error image data is coded in every 8 (horizontal) × 8 (vertical) pixels or 4 (horizontal) × 4 (vertical) pixels, but other sizes (number of pixels included) may be applied.
Also, in the present embodiment, the case has been explained where 3 or 5 consecutive B-pictures are located, but another number of pictures may be located.
Further, in the present embodiment, the case has been explained where one of intra picture coding, inter picture prediction coding using motion vectors and inter picture prediction coding without using motion vectors is selected as a coding mode for P-pictures, and one of intra picture coding, inter picture prediction coding using a forward motion vector, inter picture prediction coding using a backward motion vector, inter picture prediction coding using bi-predictive motion vectors and direct mode is selected for B-pictures, but other coding modes may be used.
Also, in the present embodiment, seven examples of direct mode have been explained, but a method which is uniquely determined in every macroblock or block may be used, or any of a plurality of methods in every macroblock or block may be selected. If a plurality of methods are used, information indicating which type of direct mode has been used is described in a bit stream.
In addition, in the present embodiment, the case has been explained where a P-picture is coded with reference to one previously coded I or P-picture which is located temporally earlier or later in display order than the current P-picture, and a B-picture is coded with reference to two previously processed neighboring pictures which are located earlier or later in display order than the current B-picture, respectively. However, in the case of a P-picture, the P-picture may be coded with reference to at most one picture for each block from among a plurality of previously coded I or P pictures as candidate reference pictures, and in the case of a B-picture, the B-picture may be coded with reference to at most two pictures for each block from among a plurality of previously coded neighboring pictures which are located temporally earlier or later in display order as candidate reference pictures.
In addition, when storing motion vectors in the motion vector storage unit 116, the mode selection unit 109 may store both forward and backward motion vectors or only a forward motion vector, if a current block is coded by bi-predictive reference or in direct mode. If it stores only the forward motion vector, the volume stored in the motion vector storage unit 116 can be reduced.
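The scaling used in the direct-mode examples of this embodiment can be illustrated with a minimal sketch. It is not part of the specification. Equation 5 is taken as printed above (MVB = −TRB × MV / TRD). Equation 1 and Equation 2 are defined earlier in the specification and are assumed here to have the forms MVF = MV × TRF / TRD and MVB = (TRF − TRD) × MV / TRD. The function name scale_direct_mode, the sample vector and the use of exact fractions instead of any particular rounding rule are hypothetical choices made only for illustration.

```python
from fractions import Fraction


def scale_direct_mode(mv, trd, trf, trb=None):
    """Derive (MVF, MVB) for a direct-mode block from a stored vector mv = (x, y).

    trd, trf, trb are temporal distances in display order.
    Assumed forms (not restated in this section of the specification):
      Equation 1: MVF = MV * TRF / TRD
      Equation 2: MVB = (TRF - TRD) * MV / TRD   (used when trb is None)
    As printed in the specification:
      Equation 5: MVB = -TRB * MV / TRD          (used when trb is given)
    """
    if trd == 0:
        raise ValueError("TRD must be non-zero")
    mvf = tuple(Fraction(c * trf, trd) for c in mv)
    if trb is None:
        mvb = tuple(Fraction((trf - trd) * c, trd) for c in mv)
    else:
        mvb = tuple(Fraction(-trb * c, trd) for c in mv)
    return mvf, mvb


# Sixth example (hypothetical numbers): current picture B6, forward reference P5,
# backward reference B7, stored vector g of block f in P9 pointing to P5.
# TRD = 9 - 5 = 4, TRF = 6 - 5 = 1, TRB = 7 - 6 = 1.
sixth = scale_direct_mode((8, -4), trd=4, trf=1, trb=1)

# Seventh example (hypothetical numbers): backward reference is P9 itself,
# so Equation 2 applies with TRD = 4 and TRF = 1.
seventh = scale_direct_mode((8, -4), trd=4, trf=1)
```

With the sample vector (8, −4), the sixth example yields MVF = (2, −1) and MVB = (−2, 1), that is, two vectors parallel to g pointing to P5 and B7, while the seventh example yields the same MVF and MVB = (−6, 3) pointing to P9. The fifth example needs no scaling at all, since both vectors are forced to “0”.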
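The two coding-order policies described for the first embodiment (FIGS. 12 to 15), coding first the B-picture farthest from, or closest to, the previously processed pictures, can be sketched as follows. This is only an illustration under stated assumptions: distance is taken as the difference of display-order picture numbers to the nearest already processed picture, pictures with equal distance are grouped into one round and listed in display order, and the function name coding_rounds is hypothetical.

```python
def coding_rounds(anchors, b_pictures, farthest_first=True):
    """Group B-pictures into coding rounds.

    anchors: display-order numbers of the already coded I/P-pictures.
    b_pictures: display-order numbers of the B-pictures between them.
    Each round holds the pictures whose distance to the nearest processed
    picture is the largest (farthest_first=True) or the smallest (False).
    """
    processed = set(anchors)
    remaining = sorted(b_pictures)
    pick = max if farthest_first else min
    rounds = []
    while remaining:
        # Distance of each remaining B-picture to its nearest processed picture.
        dist = {b: min(abs(b - p) for p in processed) for b in remaining}
        target = pick(dist.values())
        chosen = [b for b in remaining if dist[b] == target]
        rounds.append(chosen)
        processed.update(chosen)
        remaining = [b for b in remaining if b not in chosen]
    return rounds


# Three B-pictures between P5 and P9.
three_far = coding_rounds([5, 9], [6, 7, 8], farthest_first=True)
three_near = coding_rounds([5, 9], [6, 7, 8], farthest_first=False)

# Five B-pictures between P7 and P13 (picture numbers as in the FIG. 10 example).
five_far = coding_rounds([7, 13], [8, 9, 10, 11, 12], farthest_first=True)
five_near = coding_rounds([7, 13], [8, 9, 10, 11, 12], farthest_first=False)
```

For three B-pictures, the farthest-first policy gives B7, then B6 and B8, and the closest-first policy gives B6 and B8, then B7, matching the orders stated for FIG. 12 and FIG. 13. For five B-pictures, farthest-first gives B10 and then B8, B9, B11 and B12, as stated for FIG. 14. The order inside a round is not fixed by the description and is an assumption of this sketch.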
Second Embodiment
FIG. 16 is a block diagram showing a structure of a moving picture decoding apparatus using a moving picture decoding method according to an embodiment of the present invention.
As shown in FIG. 16, the moving picture decoding apparatus includes a bit stream analysis unit 1401, a residual error decoding unit 1402, a mode decoding unit 1403, a frame memory control unit 1404, a motion compensation decoding unit 1405, a motion vector storage unit 1406, a frame memory 1407, an addition unit 1408 and switches 1409 and 1410.
The bit stream analysis unit 1401 extracts various types of data such as coding mode information and motion vector information from the inputted bit stream. The residual error decoding unit 1402 decodes the residual error coded data inputted from the bit stream analysis unit 1401 and generates residual error image data. The mode decoding unit 1403 controls the switches 1409 and 1410 with reference to the coding mode information extracted from the bit stream.
The frame memory control unit 1404 outputs the decoded picture data stored in the frame memory 1407 as output pictures based on the information indicating the display order of the pictures inputted from the bit stream analysis unit 1401.
The motion compensation decoding unit 1405 decodes the information of the reference picture numbers and the motion vectors, and obtains motion compensation image data from the frame memory 1407 based on the decoded reference picture numbers and motion vectors. The motion vector storage unit 1406 stores motion vectors.
The addition unit 1408 adds the residual error image data inputted from the residual error decoding unit 1402 and the motion compensation image data inputted from the motion compensation decoding unit 1405 for generating the decoded image data. The frame memory 1407 stores the generated decoded image data.
Next, the operation of the moving picture decoding apparatus as structured as above will be explained. Here, it is assumed that the bit stream generated by the moving picture coding apparatus is inputted to the moving picture decoding apparatus. Specifically, it is assumed that a P-picture refers to one previously processed neighboring I or P-picture which is located earlier or later than the current P-picture in display order, and a B-picture refers to two previously coded neighboring pictures which are located earlier or later than the current B-picture in display order.
In this case, the pictures in the bit stream are arranged in the order as shown in FIG. 6B. Decoding processing of pictures P9, B7, B6 and B8 will be explained below in this order.
(Decoding of Picture P9)
The bit stream of the picture P9 is inputted to the bit stream analysis unit 1401. The bit stream analysis unit 1401 extracts various types of data from the inputted bit stream. Here, various types of data mean mode selection information, motion vector information and others. The extracted mode
1403 controls the switches 1409 and 1410 so as to be connected to “b” side and “d” side respectively.
The mode decoding unit 1403 also outputs the coding mode selection information to the motion compensation decoding unit 1405. The case where the inter picture prediction coding is selected as a coding mode will be explained below. The residual error decoding unit 1402 decodes the inputted residual error coded data to generate residual error image data. The residual error decoding unit 1402 outputs the generated residual error image data to the switch 1409. Since the switch 1409 is connected to “b” side, the residual error image data is outputted to the addition unit 1408.
The motion compensation decoding unit 1405 obtains motion compensation image data from the frame memory 1407 based on the inputted motion vector information and the like. The picture P9 has been coded with reference to the picture P5, and the picture P5 has been already decoded and stored in the frame memory 1407. So, the motion compensation decoding unit 1405 obtains the motion compensation image data from the picture data of the picture P5 stored in the frame memory 1407, based on the motion vector information. The motion compensation image data generated in this manner is outputted to the addition unit 1408.
When decoding P-pictures, the motion compensation decoding unit 1405 stores the motion vector information in the motion vector storage unit 1406.
The addition unit 1408 adds the inputted residual error image data and motion compensation image data to generate decoded image data. The generated decoded image data is outputted to the frame memory 1407 via the switch 1410.
That is the completion of decoding one macroblock in the picture P9. According to the same processing, the remaining macroblocks in the picture P9 are decoded in sequence. And after all the macroblocks in the picture P9 are decoded, the picture B7 is decoded.
(Decoding of Picture B7)
Since the operations of the bit stream analysis unit 1401, the mode decoding unit 1403 and the residual error decoding unit 1402 until generation of residual error image data are the same as those for decoding the picture P9, the explanation thereof will be omitted.
The motion compensation decoding unit 1405 generates motion compensation image data based on the inputted motion vector information and the like. The picture B7 is coded with reference to the picture P5 as a forward reference picture and the picture P9 as a backward reference picture, and these pictures P5 and P9 have already been decoded and stored in the frame memory 1407.
If inter picture bi-prediction coding is selected as a coding mode, the motion compensation decoding unit 1405 obtains the forward reference picture data from the frame memory 1407 based on the forward motion vector information. It also obtains the backward reference picture data from the frame memory 1407 based on the backward motion vector informa
selection information is outputted to the mode decoding unit tion. Then, the motion compensation decoding unit 1405
1403. The extracted motion vector information is outputted to averages the forward and backward reference picture data to
the motion compensation decoding unit 1405. And the generate motion compensation image data.
residual error coded data is outputted to the residual error When direct mode is selected as a coding mode, the motion
decoding unit 1402. 60 compensation decoding unit 1405 obtains the motion vector
The mode decoding unit 1403 controls the switches 1409 of the picture P9 stored in the motion vector storage unit 1406.
and 1410 with reference to the coding mode selection infor Using this motion vector, the motion compensation decoding
mation extracted from the bit stream. If intra picture coding is unit 1405 obtains the forward and backward reference picture
selected as a coding mode, the mode decoding unit 1403 data from the frame memory 1407. Then, the motion com
controls the Switches 1409 and 1410 so as to be connected to 65 pensation decoding unit 1405 averages the forward and back
“a side and 'c' side respectively. If inter picture prediction ward reference picture data to generate motion compensation
coding is selected as a coding mode, the mode decoding unit image data.
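The averaging of forward and backward reference picture data, followed by the addition of the residual error image data, can be sketched as below. This is only an illustration: the function names `bi_predict` and `reconstruct`, the round-half-up averaging and the clipping to an 8-bit sample range are assumptions that the description does not specify.

```python
def bi_predict(forward_block, backward_block):
    """Average two reference blocks sample by sample.

    Round-half-up integer averaging is an assumption; the description
    only says that the two reference blocks are averaged.
    """
    return [
        [(f + b + 1) // 2 for f, b in zip(f_row, b_row)]
        for f_row, b_row in zip(forward_block, backward_block)
    ]


def reconstruct(residual_block, prediction_block, bit_depth=8):
    """Add residual error data to the prediction (role of addition unit 1408).

    Clipping to the sample range is an assumption made for this sketch.
    """
    max_value = (1 << bit_depth) - 1
    return [
        [min(max_value, max(0, r + p)) for r, p in zip(r_row, p_row)]
        for r_row, p_row in zip(residual_block, prediction_block)
    ]


# Hypothetical 1x2 blocks fetched from P5 (forward) and P9 (backward).
prediction = bi_predict([[100, 50]], [[101, 60]])
decoded = reconstruct([[-5, 210]], prediction)
```

In this sketch the prediction step and the addition step are kept separate, mirroring the split between the motion compensation decoding unit 1405 and the addition unit 1408.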
The case where the direct mode is selected as a coding mode will be explained with reference to FIG. 7A again. Here, it is assumed that the block a in the picture B7 is to be decoded and the block b in the picture P9 is co-located with the block a. The motion vector of the block b is the motion vector c, which refers to the picture P5. In this case, the motion vector d which is obtained utilizing the motion vector c and refers to the picture P5 is used as a forward motion vector, and the motion vector e which is obtained utilizing the motion vector c and refers to the picture P9 is used as a backward motion vector. For example, as a method of utilizing the motion vector c, there is a method of generating motion vectors parallel to the motion vector c. The motion compensation image data is obtained by averaging the forward and backward reference data obtained based on these motion vectors.
In this case where the forward motion vector d is MVF, the backward motion vector e is MVB, the motion vector c is MV, the temporal distance between the backward reference picture P9 for the current picture B7 and the picture P5 which the block b in the backward reference picture P9 refers to is TRD, and the temporal distance between the current picture B7 and the forward reference picture P5 is TRF respectively, the motion vector d MVF and the motion vector e MVB are respectively calculated by Equation 1 and Equation 2, where MVF and MVB represent horizontal and vertical components of the motion vectors respectively. Note that the temporal distance between the pictures can be determined based on the information indicating the display order (position) given to respective pictures or the difference specified by the information.
The motion compensation image data generated in this manner is outputted to the addition unit 1408. The motion compensation decoding unit 1405 stores the motion vector information in the motion vector storage unit 1406.
The addition unit 1408 adds the inputted residual error image data and the motion compensation image data to generate decoded image data. The generated decoded image data is outputted to the frame memory 1407 via the switch 1410.
That is the completion of decoding one macroblock in the picture B7. According to the same processing, the remaining macroblocks in the picture B7 are decoded in sequence. And after all the macroblocks of the picture B7 are decoded, the picture B6 is decoded.
(Decoding of Picture B6)
Since the operations of the bit stream analysis unit 1401, the mode decoding unit 1403 and the residual error decoding unit 1402 until generation of residual error image data are same as those for decoding the picture P9, the explanation thereof will be omitted.
The motion compensation decoding unit 1405 generates motion compensation image data based on the inputted motion vector information and the like. The picture B6 has been coded with reference to the picture P5 as a forward reference picture and the picture B7 as a backward reference picture, and these pictures P5 and B7 have been already decoded and stored in the frame memory 1407.
If inter picture bi-prediction coding is selected as a coding mode, the motion compensation decoding unit 1405 obtains the forward reference picture data from the frame memory 1407 based on the forward motion vector information. It also obtains the backward reference picture data from the frame memory 1407 based on the backward motion vector information. Then, the motion compensation decoding unit 1405 averages the forward and backward reference picture data to generate motion compensation image data.
When the direct mode is selected as a coding mode, the motion compensation decoding unit 1405 obtains the motion vector of the picture B7 stored in the motion vector storage unit 1406. Using this motion vector, the motion compensation decoding unit 1405 obtains the forward and backward reference picture data from the frame memory 1407. Then, the motion compensation decoding unit 1405 averages the forward and backward reference picture data to generate motion compensation image data.
The first example of the case where the direct mode is selected as a coding mode will be explained with reference to FIG. 7B again. Here, it is assumed that the block a in the picture B6 is to be decoded and the block b in the picture B7 is co-located with the block a. The block b has been coded by forward reference inter picture prediction or bi-predictive reference inter picture prediction, and the forward motion vector of the block b is the motion vector c, which refers to the picture P5. In this case, the motion vector d which is obtained utilizing the motion vector c and refers to the picture P5 is used as a forward motion vector, and the motion vector e which is obtained utilizing the motion vector c and refers to the picture B7 is used as a backward motion vector. For example, as a method of utilizing the motion vector c, there is a method of generating motion vectors parallel to the motion vector c. The motion compensation image data is obtained by averaging the forward and backward reference picture data obtained based on these motion vectors d and e.
In this case where the forward motion vector d is MVF, the backward motion vector e is MVB, the motion vector c is MV, the temporal distance between the backward reference picture B7 for the current picture B6 and the picture P5 which the block b in the backward reference picture B7 refers to is TRD, and the temporal distance between the current picture B6 and the forward reference picture P5 is TRF respectively, the motion vector d MVF and the motion vector e MVB are respectively calculated by Equation 1 and Equation 2. Note that the temporal distance between pictures may be determined based on the information indicating the display order (position) of the pictures or the difference specified by the information. Or, as the values of TRD and TRF, predetermined values for respective pictures may be used. These predetermined values may be described in the bit stream as header information.
The second example of the case where the direct mode is selected as a coding mode will be explained with reference to FIG. 7B again.
In this example, the motion vector which has been used for decoding the block b in the picture B7 is utilized. The picture B7 is the backward reference picture for the current picture B6, and the block b is co-located with the block a in the picture B6. Here, it is assumed that the block b has been coded in direct mode and the motion vector c has been substantially used as a forward motion vector for that coding. The motion vector c stored in the motion vector storage unit 1406 may be used, or it is calculated by reading out from the motion vector storage unit 1406 the motion vector of the picture P9 which has been used for coding the block b in direct mode, and then scaling that motion vector. Note that when storing motion vectors in the motion vector storage unit 1406, the motion compensation decoding unit 1405 needs to store only the forward motion vector out of the two motion vectors obtained by scaling for decoding the block b in the picture B7 in direct mode.
In this case, for the block a, the motion vector d which is generated utilizing the motion vector c and refers to the picture P5 is used as a forward motion vector, and the motion vector e which is generated utilizing the motion vector c and
refers to the picture B7 is used as a backward motion vector. For example, as a method of utilizing the motion vector c, there is a method of generating motion vectors parallel to the motion vector c. The motion compensation image data is obtained by averaging the forward and backward reference picture data obtained based on these motion vectors d and e.
In this case, the motion vector d MVF and the motion vector e MVB are respectively calculated by Equation 1 and Equation 2, as is the case of the first example of the direct mode.
Next, the third example of the case where the direct mode is selected as a coding mode will be explained with reference to FIG. 7C again.
In this example, it is assumed that the block a in the picture B6 is to be decoded, and the block b in the picture B7 is co-located with the block a. The block b has been coded by backward reference prediction, and the backward motion vector of the block b is a motion vector f, which refers to the picture P9. In this case, for the block a, the motion vector g which is obtained utilizing the motion vector f and refers to the picture P5 is used as a forward motion vector, and the motion vector h which is obtained utilizing the motion vector f and refers to the picture B7 is used as a backward motion vector. For example, as a method of utilizing the motion vector f, there is a method of generating motion vectors parallel to the motion vector f. The motion compensation image data is obtained by averaging the forward and backward reference picture data obtained based on these motion vectors g and h.
In this case where the forward motion vector g is MVF, the backward motion vector h is MVB, the motion vector f is MV, the temporal distance between the backward reference picture B7 for the current picture B6 and the picture P9 which the block b in the backward reference picture B7 refers to is TRD, the temporal distance between the current picture B6 and the forward reference picture P5 is TRF, and the temporal distance between the current picture B6 and the backward reference picture B7 is TRB respectively, the motion vector g MVF and the motion vector h MVB are respectively calculated by Equation 3 and Equation 4.
Next, the fourth example of the case where the direct mode is selected as a coding mode will be explained with reference to FIG. 7D again.
In this example, it is assumed that the block a in the picture B6 is to be decoded, and the block b in the picture B7 is co-located with the block a. The block b has been coded by backward reference prediction as is the case of the third example, and the backward motion vector of the block b is a motion vector f, which refers to the picture P9. In this case, the motion vector g which is obtained utilizing the motion vector f and refers to the picture P9 is used as a forward motion vector, and the motion vector h which is obtained utilizing the motion vector f and refers to the picture B7 is used as a backward motion vector. For example, as a method of utilizing the motion vector f, there is a method of generating motion vectors parallel to the motion vector f. The motion compensation image data is obtained by averaging the forward and backward reference picture data obtained based on these motion vectors g and h.
In this case where the forward motion vector g is MVF, the backward motion vector h is MVB, the motion vector f is MV, the temporal distance between the backward reference picture B7 for the current picture B6 and the picture P9 which the block b in the backward reference picture B7 refers to is TRD, and the temporal distance between the current picture B6 and the reference picture P9 which the block b in the backward reference picture B7 refers to is TRF respectively, the motion vector g MVF and the motion vector h MVB are respectively calculated by Equation 1 and Equation 2.
Furthermore, the fifth example of the case where the direct mode is selected as a coding mode will be explained with reference to FIG. 8A again. Here, it is assumed that a block a in the picture B6 is to be decoded in direct mode. In this example, the motion vector is set to zero "0", and motion compensation is performed by bi-predictive reference using the picture P5 as a forward reference picture and the picture B7 as a backward reference picture.
Next, the sixth example of the case where the direct mode is selected as a coding mode will be explained with reference to FIG. 8B again. Here, it is assumed that a block a in the picture B6 is to be decoded in direct mode. In this example, the motion vector g which has been used for decoding the block f in the P-picture P9 is utilized. The picture P9 is located later than the current picture B6, and the block f is co-located with the block a. The motion vector g is stored in the motion vector storage unit 1406. The block a is bi-predicted from the forward reference picture P5 and the backward reference picture B7 using the motion vectors which are obtained utilizing the motion vector g. For example, if a method of generating motion vectors parallel to the motion vector g is used, as is the case of the above-mentioned first example, the motion vector h and the motion vector i are used for the picture P5 and the picture B7 respectively for obtaining the motion compensation image data of the block a.
In this case where the forward motion vector h is MVF, the backward motion vector i is MVB, the motion vector g is MV, the temporal distance between the picture P9 located later than the current picture B6 and the picture P5 which the block f in the picture P9 refers to is TRD, the temporal distance between the current picture B6 and the forward reference picture P5 is TRF, and the temporal distance between the current picture B6 and the backward reference picture B7 is TRB respectively, the motion vector h MVF and the motion vector i MVB are respectively calculated by Equation 1 and Equation 5.
Next, the seventh example of the case where the direct mode is selected as a coding mode will be explained with reference to FIG. 8C again. Here, it is assumed that a block a in the picture B6 is decoded in direct mode. In this example, the assignment of relative indices to the above-mentioned picture numbers is changed (remapped) and the picture P9 is the backward reference picture. In this case, the motion vector g which has been used for coding the block f in the picture P9 is utilized. The picture P9 is the backward reference picture for the picture B6, and the block f is co-located with the block a in the picture B6. The motion vector g is stored in the motion vector storage unit 1406. The block a is bi-predicted from the forward reference picture P5 and the backward reference picture P9 using motion vectors generated utilizing the motion vector g. For example, if a method of generating motion vectors parallel to the motion vector g is used, as is the case of the above-mentioned first example, the motion vector h and the motion vector i are used for the picture P5 and the picture P9 respectively for obtaining the motion compensation image data of the block a.
In this case, where the forward motion vector h is MVF, the backward motion vector i is MVB, the motion vector g is MV, the temporal distance between the backward reference picture P9 for the current picture B6 and the picture P5 which the block f in the picture P9 refers to is TRD, and the temporal distance between the current picture B6 and the forward reference picture P5 is TRF respectively, the motion vector h MVF and the motion vector i MVB are respectively calculated by Equation 1 and Equation 2.
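The direct-mode examples above all derive a forward and a backward motion vector by scaling a stored motion vector MV with ratios of temporal distances. Equations 1 to 5 are defined elsewhere in the patent and are not reproduced in this passage, so the following sketch assumes that they take the linear scaling forms listed in its comments. The function names, the use of exact fractions instead of any particular sub-pixel rounding, and the reading of picture names such as P5 and B7 as display positions 5 and 7 are likewise assumptions made for illustration.

```python
from fractions import Fraction

# ASSUMED forms of the equations referenced in the text:
#   Equation 1: MVF = MV * TRF / TRD
#   Equation 2: MVB = (TRF - TRD) * MV / TRD
#   Equation 3: MVF = -TRF * MV / TRD
#   Equation 4: MVB = TRB * MV / TRD
#   Equation 5: MVB = -TRB * MV / TRD


def scale(mv, numerator, denominator):
    """Scale the horizontal and vertical components of mv by a ratio."""
    return tuple(Fraction(c) * numerator / denominator for c in mv)


def direct_vectors(mv, trd, trf, trb=None, equations=(1, 2)):
    """Return (MVF, MVB) for the chosen pair of (assumed) equations."""
    if equations == (1, 2):
        return scale(mv, trf, trd), scale(mv, trf - trd, trd)
    if equations == (3, 4):
        return scale(mv, -trf, trd), scale(mv, trb, trd)
    if equations == (1, 5):
        return scale(mv, trf, trd), scale(mv, -trb, trd)
    raise ValueError("unsupported equation pair")


# Picture B7 in direct mode: vector c of block b in P9 refers to P5,
# so TRD = 9 - 5 = 4 and TRF = 7 - 5 = 2 (hypothetical vector value).
mvf_b7, mvb_b7 = direct_vectors((8, -4), trd=4, trf=2)

# Third example for B6: vector f of block b in B7 refers to P9,
# so TRD = 9 - 7 = 2, TRF = 6 - 5 = 1 and TRB = 7 - 6 = 1.
mvf_b6, mvb_b6 = direct_vectors((6, 2), trd=2, trf=1, trb=1, equations=(3, 4))
```

Under these assumptions the two derived vectors are parallel to the stored vector, and the backward vector has the opposite sign of the forward one, which matches the "parallel motion vectors" method described in the examples.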
The motion compensation image data generated as above is outputted to the addition unit 1408. The addition unit 1408 adds the inputted residual error image data and the motion compensation image data to generate decoded image data. The generated decoded image data is outputted to the frame memory 1407 via the switch 1410.
That is the completion of decoding one macroblock in the picture B6. According to the same processing, the remaining macroblocks in the picture B6 are decoded in sequence. And after all the macroblocks in the picture B6 are decoded, the picture B8 is decoded.
(Decoding of Picture B8)
Since the operations of the bit stream analysis unit 1401, the mode decoding unit 1403 and the residual error decoding unit 1402 until generation of residual error image data are same as those for decoding the picture P9, the explanation thereof will be omitted.
The motion compensation decoding unit 1405 generates motion compensation image data based on the inputted motion vector information and the like. The picture B8 has been coded with reference to the picture B7 as a forward reference picture and the picture P9 as a backward reference picture, and these pictures B7 and P9 have been already decoded and stored in the frame memory 1407.
If inter picture bi-prediction coding is selected as a coding mode, the motion compensation decoding unit 1405 obtains the forward reference image data from the frame memory 1407 based on the forward motion vector information. It also obtains the backward reference image data from the frame memory 1407 based on the backward motion vector information. Then, the motion compensation decoding unit 1405 averages the forward and backward reference image data to generate motion compensation image data.
When direct mode is selected as a coding mode, the motion compensation decoding unit 1405 obtains the motion vector of the picture P9 stored in the motion vector storage unit 1406. Using this motion vector, the motion compensation decoding unit 1405 obtains the forward and backward reference image data from the frame memory 1407. Then, the motion compensation decoding unit 1405 averages the forward and backward reference picture data to generate motion compensation image data.
The case where the direct mode is selected as a coding mode will be explained with reference to FIG. 8D again. Here, it is assumed that a block a in the picture B8 is to be decoded and a block b in the backward reference picture P9 is co-located with the block a. The forward motion vector of the block b is the motion vector c, which refers to the picture P5. In this case, the motion vector d which is generated utilizing the motion vector c and refers to the picture B7 is used as a forward motion vector, and the motion vector e which is generated utilizing the motion vector c and refers to the picture P9 is used as a backward motion vector. For example, as a method of utilizing the motion vector c, there is a method of generating motion vectors parallel to the motion vector c. The motion compensation image data is obtained by averaging the forward and backward reference image data obtained based on these motion vectors d and e.
In this case where the forward motion vector d is MVF, the backward motion vector e is MVB, the motion vector c is MV, the temporal distance between the backward reference picture P9 for the current picture B8 and the picture P5 which the block b in the backward reference picture P9 refers to is TRD, the temporal distance between the current picture B8 and the forward reference picture B7 is TRF, and the temporal distance between the current picture B8 and the backward reference picture P9 is TRB respectively, the motion vector d MVF and the motion vector e MVB are respectively calculated by Equation 1 and Equation 5.
The motion compensation image data generated in this manner is outputted to the addition unit 1408. The addition unit 1408 adds the inputted residual error image data and the motion compensation image data to generate decoded image data. The generated decoded image data is outputted to the frame memory 1407 via the switch 1410.
That is the completion of decoding one macroblock in the picture B8. According to the same processing, the remaining macroblocks in the picture B8 are decoded in sequence. The other pictures are decoded depending on their picture types according to the above-mentioned decoding procedures.
Next, the frame memory control unit 1404 reorders the picture data of the pictures stored in the frame memory 1407 in time order as shown in FIG. 6A for outputting as output pictures.
As described above, according to the moving picture decoding method of the present invention, a B-picture which has been coded by inter picture bi-prediction is decoded using previously decoded pictures which are located close in display order as forward and backward reference pictures.
When the direct mode is selected as a coding mode, reference image data is obtained from previously decoded image data to obtain motion compensation image data, with reference to a motion vector of a previously decoded backward reference picture stored in the motion vector storage unit 1406.
According to this operation, when a B-picture has been coded by inter picture bi-prediction using pictures which are located close in display order as forward and backward reference pictures, the bit stream generated as a result of such coding can be properly decoded.
In the present embodiment, seven examples of the direct mode have been explained. However, one method, which is uniquely determined for every macroblock or block based on the decoding method of a co-located block in a backward reference picture, may be used, or a plurality of different methods may be used for every macroblock or block by switching them. When a plurality of methods are used, the macroblock or the block is decoded using information described in a bit stream, indicating which type of direct mode has been used. For that purpose, the operation of the motion compensation decoding unit 1405 depends upon the information. For example, when this information is added for every block of motion compensation, the mode decoding unit 1403 determines which type of direct mode is used for coding and delivers it to the motion compensation decoding unit 1405. The motion compensation decoding unit 1405 performs decoding processing using the decoding method as explained in the present embodiment depending upon the delivered type of direct mode.
Also, in the present embodiment, the picture structure where three B-pictures are located between I-pictures and P-pictures has been explained, but any other number, four or five, for instance, of B-pictures may be located.
In addition, in the present embodiment, the explanation has been made on the assumption that a P-picture is coded with reference to one previously coded I or P-picture which is located earlier or later than the current P-picture in display order, a B-picture is coded with reference to two previously coded neighboring pictures which are located earlier or later than the current B-picture in display order, and the bit stream generated as a result of this coding is decoded. However, in the case of a P-picture, the P-picture may be coded with reference to at most one picture for each block from among a plurality of previously coded I or P pictures which are located
temporally earlier or later in display order as candidate reference pictures, and in the case of a B-picture, the B-picture may be coded with reference to at most two pictures for each block from among a plurality of previously coded neighboring pictures which are located temporally earlier or later in display order as candidate reference pictures.
Furthermore, when storing motion vectors in the motion vector storage unit 1406, the motion compensation decoding unit 1405 may store both forward and backward motion vectors, or store only the forward motion vector, if a current block is coded by bi-predictive reference or in direct mode. If only the forward motion vector is stored, the memory volume of the motion vector storage unit 1406 can be reduced.

Third Embodiment

If a program for realizing the structures of the moving picture coding method or the moving picture decoding method as shown in the above embodiments is recorded on a memory medium such as a flexible disk, it becomes possible to perform the processing as shown in these embodiments easily in an independent computer system.
FIG. 17 is an illustration showing the case where the processing is performed in a computer system using a flexible disk which stores the moving picture coding method or the moving picture decoding method of the above embodiments.
FIG. 17B shows a front view and a cross-sectional view of an appearance of a flexible disk, and the flexible disk itself, and FIG. 17A shows an example of a physical format of a flexible disk as a recording medium body. The flexible disk FD is contained in a case F, and a plurality of tracks Tr are formed concentrically on the surface of the disk in the radius direction from the periphery and each track is divided into 16 sectors Se in the angular direction. Therefore, as for the flexible disk storing the above-mentioned program, the moving picture coding method as the program is recorded in an area allocated for it on the flexible disk FD.
FIG. 17C shows the structure for recording and reproducing the program on and from the flexible disk FD. When the program is recorded on the flexible disk FD, the moving picture coding method or the moving picture decoding method as a program is written in the flexible disk from the computer system Cs via a flexible disk drive. When the moving picture coding method is constructed in the computer system by the program on the flexible disk, the program is read out from the flexible disk drive and transferred to the computer system.
The above explanation is made on the assumption that a recording medium is a flexible disk, but the same processing can also be performed using an optical disk. In addition, the recording medium is not limited to a flexible disk and an optical disk, but any other medium such as an IC card and a ROM cassette capable of recording a program can be used.
Following is the explanation of the applications of the moving picture coding method and the moving picture decoding method as shown in the above embodiments, and the system using them.
FIG. 18 is a block diagram showing the overall configuration of a content supply system ex100 for realizing content distribution service. The area for providing communication service is divided into cells of desired size, and base stations ex107-ex110 which are fixed wireless stations are placed in respective cells.
In this content supply system ex100, devices such as a computer ex111, a PDA (personal digital assistant) ex112, a camera ex113, a mobile phone ex114 and a camera-equipped mobile phone ex115 are connected to the Internet ex101 via an Internet service provider ex102, a telephone network ex104 and base stations ex107-ex110.
However, the content supply system ex100 is not limited to the configuration as shown in FIG. 18, and a combination of any of them may be connected. Also, each device may be connected directly to the telephone network ex104, not through the base stations ex107-ex110.
The camera ex113 is a device such as a digital video camera capable of shooting moving pictures. The mobile phone may be a mobile phone of a PDC (Personal Digital Communications) system, a CDMA (Code Division Multiple Access) system, a W-CDMA (Wideband-Code Division Multiple Access) system or a GSM (Global System for Mobile Communications) system, a PHS (Personal Handyphone system) or the like.
A streaming server ex103 is connected to the camera ex113 via the base station ex109 and the telephone network ex104, which enables live distribution or the like using the camera ex113 based on the coded data transmitted from a user. Either the camera ex113 or the server for transmitting the data may code the data. Also, the moving picture data shot by a camera ex116 may be transmitted to the streaming server ex103 via the computer ex111. The camera ex116 is a device such as a digital camera capable of shooting still and moving pictures. Either the camera ex116 or the computer ex111 may code the moving picture data. An LSI ex117 included in the computer ex111 or the camera ex116 actually performs coding processing. Software for coding and decoding moving pictures may be integrated into any type of storage medium (such as a CD-ROM, a flexible disk and a hard disk) that is a recording medium which is readable by the computer ex111 or the like. Furthermore, a camera-equipped mobile phone ex115 may transmit the moving picture data. This moving picture data is the data coded by the LSI included in the mobile phone ex115.
The content supply system ex100 codes contents (such as a music live video) shot by users using the camera ex113, the camera ex116 or the like in the same manner as the above embodiment and transmits them to the streaming server ex103, while the streaming server ex103 makes stream distribution of the content data to the clients at their request. The clients include the computer ex111, the PDA ex112, the camera ex113, the mobile phone ex114 and so on capable of decoding the above-mentioned coded data. In the content supply system ex100, the clients can thus receive and reproduce the coded data, and further can receive, decode and reproduce the data in real time so as to realize personal broadcasting.
When each device in this system performs coding or decoding, the moving picture coding apparatus or the moving picture decoding apparatus, as shown in the above-mentioned embodiment, can be used.
A mobile phone will be explained as an example of the device.
FIG. 19 is a diagram showing the mobile phone ex115 using the moving picture coding method and the moving picture decoding method explained in the above embodiments. The mobile phone ex115 has an antenna ex201 for sending and receiving radio waves to and from the base station ex110, a camera unit ex203 such as a CCD camera capable of shooting video and still pictures, a display unit ex202 such as a liquid crystal display for displaying the data obtained by decoding video and the like shot by the camera unit ex203 and received by the antenna ex201, a body unit including a set of operation keys ex204, a voice output unit ex208 such as a speaker for outputting voices, a voice input unit 205 such as a microphone for inputting voices, a storage medium ex207 for storing coded or decoded data such as data
of moving or still pictures shot by the camera, text data and data of moving or still pictures of received e-mails, and a slot unit ex206 for attaching the storage medium ex207 to the mobile phone ex115. The storage medium ex207 includes a flash memory element, a kind of EEPROM (Electrically Erasable and Programmable Read Only Memory) that is an electrically erasable and rewritable nonvolatile memory, in a plastic case such as an SD card.
The mobile phone ex115 will be further explained with reference to FIG. 20. In the mobile phone ex115, a main control unit ex311 for overall controlling the display unit ex202 and the body unit including operation keys ex204 is connected to a power supply circuit unit ex310, an operation input control unit ex304, a picture coding unit ex312, a camera interface unit ex303, an LCD (Liquid Crystal Display) control unit ex302, a picture decoding unit ex309, a multiplex/demultiplex unit ex308, a record/reproduce unit ex307, a modem circuit unit ex306 and a voice processing unit ex305 to each other via a synchronous bus ex313.
When a call-end key or a power key is turned ON by a user's operation, the power supply circuit unit ex310 supplies respective units with power from a battery pack so as to activate the camera-equipped digital mobile phone ex115 for making it into a ready state.
In the mobile phone ex115, the voice processing unit ex305 converts the voice signals received by the voice input unit ex205 in conversation mode into digital voice data under the control of the main control unit ex311 including a CPU, ROM and RAM, the modem circuit unit ex306 performs spread spectrum processing of the digital voice data, and the send/receive circuit unit ex301 performs digital-to-analog conversion and frequency transform of the data, so as to transmit it via the antenna ex201. Also, in the mobile phone ex115, after the data received by the antenna ex201 in conversation mode is amplified and subjected to frequency transform and analog-to-digital conversion, the modem circuit unit ex306 performs inverse spread spectrum processing of the data, and the voice processing unit ex305 converts it into analog voice data, so as to output it via the voice output unit ex208.
Furthermore, when transmitting e-mail in data communication mode, the text data of the e-mail inputted by operating the operation keys ex204 on the body unit is sent out to the main control unit ex311 via the operation input control unit ex304. In the main control unit ex311, after the modem circuit unit ex306 performs spread spectrum processing of the text data and the send/receive circuit unit ex301 performs digital-to-analog conversion and frequency transform for it, the data is transmitted to the base station ex110 via the antenna ex201.
When picture data is transmitted in data communication mode, the picture data shot by the camera unit ex203 is supplied to the picture coding unit ex312 via the camera interface unit ex303. When it is not transmitted, it is also possible to display the picture data shot by the camera unit ex203 directly on the display unit ex202 via the camera interface unit ex303 and the LCD control unit ex302.
The picture coding unit ex312, which includes the moving picture coding apparatus as explained in the present invention, compresses and codes the picture data supplied from the camera unit ex203 by the coding method used for the moving picture coding apparatus as shown in the above embodiment so as to transform it into coded picture data, and sends it out to the multiplex/demultiplex unit ex308. At this time, the mobile phone ex115 sends out the voices received by the voice input unit ex205 during shooting by the camera unit ex203 to the multiplex/demultiplex unit ex308 as digital voice data via the voice processing unit ex305.
The multiplex/demultiplex unit ex308 multiplexes the coded picture data supplied from the picture coding unit ex312 and the voice data supplied from the voice processing unit ex305 by a predetermined method, the modem circuit unit ex306 performs spread spectrum processing of the multiplexed data obtained as a result of the multiplexing, and the send/receive circuit unit ex301 performs digital-to-analog conversion and frequency transform of the data for transmitting via the antenna ex201.
As for receiving data of a moving picture file which is linked to a Web page or the like in data communication mode, the modem circuit unit ex306 performs inverse spread spectrum processing of the data received from the base station ex110 via the antenna ex201, and sends out the multiplexed data obtained as a result of the processing to the multiplex/demultiplex unit ex308.
In order to decode the multiplexed data received via the antenna ex201, the multiplex/demultiplex unit ex308 separates the multiplexed data into a bit stream of picture data and a bit stream of voice data, and supplies the coded picture data to the picture decoding unit ex309 and the voice data to the voice processing unit ex305 respectively via the synchronous bus ex313.
Next, the picture decoding unit ex309, which includes the moving picture decoding apparatus as explained in the present invention, decodes the bit stream of picture data by the decoding method corresponding to the coding method as shown in the above-mentioned embodiment to generate reproduced moving picture data, and supplies this data to the display unit ex202 via the LCD control unit ex302, and thus moving picture data included in a moving picture file linked to a Web page, for instance, is displayed. At the same time, the voice processing unit ex305 converts the voice data into analog voice data, and supplies this data to the voice output unit ex208, and thus voice data included in a moving picture file linked to a Web page, for instance, is reproduced.
The present invention is not limited to the above-mentioned system, and at least either the moving picture coding apparatus or the moving picture decoding apparatus in the above-mentioned embodiment can be incorporated into a digital broadcasting system as shown in FIG. 21. Such ground-based or satellite digital broadcasting has been in the news lately. More specifically, a bit stream of video information is transmitted from a broadcast station ex409 to or communicated with a broadcast satellite ex410 via radio waves. Upon receipt of it, the broadcast satellite ex410 transmits radio waves for broadcasting, a home-use antenna ex406 with a satellite broadcast reception function receives the radio waves, and a television (receiver) ex401 or a set top box (STB) ex407 decodes the bit stream for reproduction. The moving picture decoding apparatus as shown in the above-mentioned embodiment can be implemented in the reproduction device ex403 for reading off and decoding the bit stream recorded on a storage medium ex402 that is a recording medium such as a CD and DVD. In this case, the reproduced video signals are displayed on a monitor ex404. It is also conceived to implement the moving picture decoding apparatus in the set top box ex407 connected to a cable ex405 for a cable television or the antenna ex406 for satellite and/or ground-based broadcasting so as to reproduce them on a monitor ex408 of the television ex401. The moving picture decoding apparatus may be incorporated into the television, not in the set top box. Or, a car ex412 having an antenna ex411 can receive signals from the satellite ex410 or the base station ex107 for reproducing moving pictures on a display device such as a car navigation system ex413.
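The description of the mobile phone ex115 says only that the multiplex/demultiplex unit ex308 combines coded picture data and voice data "by a predetermined method" and later separates them into two bit streams. As a purely illustrative sketch, assuming a hypothetical tag-and-length packet format (the stream identifiers, header layout and function names below are not taken from the patent), the round trip could look like this in Python:

```python
import struct

# Hypothetical stream identifiers; the patent does not define a packet format.
PICTURE, VOICE = 0x01, 0x02
# Assumed header: 1-byte stream id followed by a 4-byte big-endian payload length.
_HEADER = struct.Struct(">BI")


def multiplex(packets):
    """Concatenate (stream_id, payload) pairs into one byte string."""
    out = bytearray()
    for stream_id, payload in packets:
        out += _HEADER.pack(stream_id, len(payload))
        out += payload
    return bytes(out)


def demultiplex(data):
    """Split a multiplexed byte string into (picture_stream, voice_stream)."""
    streams = {PICTURE: bytearray(), VOICE: bytearray()}
    offset = 0
    while offset < len(data):
        stream_id, length = _HEADER.unpack_from(data, offset)
        offset += _HEADER.size
        if stream_id not in streams:
            raise ValueError(f"unknown stream id {stream_id}")
        if offset + length > len(data):
            raise ValueError("truncated packet")
        streams[stream_id] += data[offset:offset + length]
        offset += length
    return bytes(streams[PICTURE]), bytes(streams[VOICE])
```

In this sketch the picture stream would be handed to the picture decoding unit and the voice stream to the voice processing unit; an actual terminal would use whatever multiplexing format its transmission standard prescribes.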
Furthermore, the moving picture coding apparatus as shown in the above-mentioned embodiment can code picture signals for recording on a recording medium. As a concrete example, there is a recorder ex420 such as a DVD recorder for recording picture signals on a DVD disc ex421 and a disk recorder for recording them on a hard disk. They can be recorded on an SD card ex422. If the recorder ex420 includes the moving picture decoding apparatus as shown in the above-mentioned embodiment, the picture signals recorded on the DVD disc ex421 or the SD card ex422 can be reproduced for display on the monitor ex408.
As the structure of the car navigation system ex413, the structure without the camera unit ex203, the camera interface unit ex303 and the picture coding unit ex312, out of the units shown in FIG. 20, is conceivable. The same goes for the computer ex111, the television (receiver) ex401 and others.
In addition, three types of implementations can be conceived for a terminal such as the above-mentioned mobile phone ex114: a sending/receiving terminal including both an encoder and a decoder, a sending terminal including an encoder only, and a receiving terminal including a decoder only.
As described above, it is possible to use the moving picture coding method or the moving picture decoding method in the above-mentioned embodiments in any of the above-mentioned apparatus and system, and using this method, the effects described in the above embodiments can be obtained.
Furthermore, the present invention is not limited to the above embodiments, but may be varied or modified in many ways without any departure from the scope of the present invention.
As described above, according to the moving picture coding method of the present invention, B-pictures can be coded using pictures which are temporally close in display order as reference pictures. Accordingly, prediction efficiency for motion compensation is improved and thus coding efficiency is improved.
In direct mode, by scaling a first motion vector of a second reference picture, there is no need to transmit motion vector information and thus prediction efficiency can be improved.
Similarly, in direct mode, by scaling a first motion vector substantially used for the direct mode coding of the second reference picture, there is no need to transmit motion vector information, and prediction efficiency can be improved even if a co-located block in the second reference picture is coded in direct mode.
Also, in direct mode, by scaling a second motion vector which has been used for coding a co-located block in a second reference picture, there is no need to transmit motion vector information, and prediction efficiency can be improved even if the co-located block in the second reference picture has only a second motion vector.
Furthermore, in direct mode, by setting forcedly a motion vector in direct mode to be “0”, when the direct mode is selected, there is no need to transmit motion vector information nor to scale the motion vector, and thus processing volume can be reduced.
Also, in direct mode, by scaling a motion vector of a later P-picture, there is no need to store a motion vector of a second reference picture when the second reference picture is a B-picture. And, there is no need to transmit the motion vector information, and prediction efficiency can be improved.
Furthermore, in direct mode, since a first motion vector is scaled if a second reference picture has the first motion vector, and a second motion vector is scaled if the second reference picture does not have the first motion vector but only the second motion vector, there is no need to add motion vector information to a bit stream and prediction efficiency can be improved.
In addition, according to the moving picture decoding method of the present invention, a bit stream, which is generated as a result of inter picture bi-prediction coding using pictures which are located temporally close in display order as first and second reference pictures, can be properly decoded.
INDUSTRIAL APPLICABILITY
As described above, the moving picture coding method and the moving picture decoding method according to the present invention are useful as a method for coding picture data corresponding to pictures that form a moving picture to generate a bit stream, and a method for decoding the generated bit stream, using a mobile phone, a DVD apparatus and a personal computer, for instance.
The invention claimed is:
1. An integrated circuit comprising:
  an audio processing unit operable to process audio data; and
  a picture coding unit operable to code picture data,
  wherein said picture coding unit includes:
  a coding unit operable to determine two motion vectors for a current block to be coded, based on one motion vector of a co-located block which is a block included within a previously coded B-picture and co-located with the current block, and code the current block by performing motion compensation on the current block in direct mode using the two motion vectors for the current block and two reference pictures which correspond to the two motion vectors for the current block,
  wherein said coding unit is operable to:
  when the co-located block has been coded using two motion vectors and two reference pictures which respectively correspond to the two motion vectors of the co-located block,
  specify, as a specified motion vector, one of the two motion vectors of the co-located block;
  generate the two motion vectors for the current block that are used for performing direct mode motion compensation on the current block, by scaling the specified motion vector by a ratio of a first difference and a second difference, the first difference being a difference between display order information of a reference picture corresponding to the specified motion vector and display order information of the previously coded B-picture including the co-located block, and the second difference being a difference between the display order information of the reference picture corresponding to the specified motion vector and display order information of the current picture including the current block; and
  code the current block by performing motion compensation on the current block in direct mode using the generated two motion vectors for the current block and two reference pictures, the two reference pictures being the reference picture corresponding to the specified motion vector and the previously coded B-picture including the co-located block, and the two reference pictures respectively corresponding to the generated two motion vectors, and
  wherein, when specifying the one of the two motion vectors of the co-located block as the specified motion vector, said coding unit is operable to specify only a forward motion vector as the specified motion vector, from
  among the two motion vectors including the forward motion vector and a backward motion vector.
2. A mobile terminal comprising the integrated circuit according to claim 1.
3. A coding apparatus comprising:
  an audio processing unit operable to process audio data to be processed;
  a picture coding unit operable to code picture data to be coded; and
  a power supply circuit configured to supply said coding apparatus with power,
  wherein said picture coding unit includes:
  a coding unit operable to determine two motion vectors for a current block to be coded, based on one motion vector of a co-located block which is a block included within a previously coded B-picture and co-located with the current block, and code the current block by performing motion compensation on the current block in direct mode using the two motion vectors for the current block and two reference pictures which correspond to the two motion vectors for the current block,
  wherein said coding unit is operable to:
  when the co-located block has been coded using two motion vectors and two reference pictures which respectively correspond to the two motion vectors of the co-located block,
  specify, as a specified motion vector, one of the two motion vectors of the co-located block;
  generate the two motion vectors for the current block that are used for performing direct mode motion compensation on the current block, by scaling the specified motion vector by a ratio of a first difference and a second difference, the first difference being a difference between display order information of a reference picture corresponding to the specified motion vector and display order information of the previously coded B-picture including the co-located block, and the second difference being a difference between the display order information of the reference picture corresponding to the specified motion vector and display order information of the current picture including the current block; and
  code the current block by performing motion compensation on the current block in direct mode using the generated two motion vectors for the current block and two reference pictures, the two reference pictures being the reference picture corresponding to the specified motion vector and the previously coded B-picture including the co-located block, and the two reference pictures respectively corresponding to the generated two motion vectors, and
  wherein, when specifying the one of the two motion vectors of the co-located block as the specified motion vector, said coding unit is operable to specify only a forward motion vector as the specified motion vector, from among the two motion vectors including the forward motion vector and a backward motion vector.
4. The coding apparatus according to claim 3, further comprising:
  a microphone that inputs the audio data to be processed; and
  a camera that inputs the picture data to be coded.
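The motion vector scaling recited in claims 1 and 3 can be illustrated with a short Python sketch. The claims state only that the specified (forward) motion vector is scaled by a ratio of two display-order differences; the concrete formulas below assume the usual linear-motion model of temporal direct mode, and the function and parameter names are hypothetical, chosen for illustration only.

```python
from fractions import Fraction


def scale_direct_mode_vectors(mv, t_ref, t_col, t_cur):
    """Derive two direct-mode motion vectors for the current block.

    mv:    (x, y) forward motion vector of the co-located block, pointing
           from the co-located B-picture to its reference picture
    t_ref: display order of that reference picture
    t_col: display order of the B-picture containing the co-located block
    t_cur: display order of the current picture
    """
    first_diff = t_ref - t_col    # temporal distance spanned by the specified vector
    second_diff = t_ref - t_cur   # distance from the current picture to the reference
    if first_diff == 0:
        raise ValueError("reference and co-located pictures share a display order")
    ratio = Fraction(second_diff, first_diff)
    # Assumption: the vector toward the reference picture is mv scaled by the ratio.
    mv_to_ref = (mv[0] * ratio, mv[1] * ratio)
    # Assumption: the vector toward the co-located B-picture is the remainder
    # of mv under the same linear-motion model.
    mv_to_col = (mv_to_ref[0] - mv[0], mv_to_ref[1] - mv[1])
    return mv_to_ref, mv_to_col
```

For example, with the reference picture at display order 0, the co-located B-picture at 3 and the current picture at 1, a co-located vector of (6, -3) yields (2, -1) toward the reference picture and (-4, 2) toward the B-picture. Exact fractions are used here for clarity; a practical codec would typically use integer arithmetic with rounding.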
