
Introduction to Videoconferencing

Author: Paul Down Version 2.1 - Release


Contents
1. Overview
2. Introduction
3. The Basics
4. What is Videoconferencing used for?
5. Advantages
6. Disadvantages
7. The Conference Environment
8. Videoconferencing Equipment Types and Options
   8.1. The Television Camera and Lens
   8.2. Microphones
   8.3. The CODEC
9. The Network
   9.1. The Internet
   9.2. Integrated Services Digital Network (ISDN)
   9.3. IP and ISDN CODECs
   9.4. Broadband Networks
10. Implementing Videoconferencing - the Options
   10.1. Point to Point and Multipoint
   10.2. Voice Switched Conferences
   10.3. Chairman Control
   10.4. Continual Presence
11. Data Conferencing
12. Taking Part in a Videoconference - Advice for Participants
   12.1. Dressing for the Camera and CODEC
   12.2. Delays in the Sound (Latency)
   12.3. Echo Cancellers
   12.4. Conferencing with Delayed Sound
13. Conclusion
14. Acknowledgement
Appendix A - Television camera lens sensitivity
Appendix B - Analogue and digital signals
Appendix C - Codec audio delays
Appendix D - Useful Information Sites


1. Overview

This document, Introduction to Videoconferencing, aims to achieve three objectives:

1. Act as a basic introduction to videoconferencing for potential users, particularly those working in schools.
2. Explain some of the terminology that is peculiar to videoconferencing.
3. Supply links to other relevant information that should be useful to beginners and also to more experienced users.

If you are a new visitor to this website, the depth and breadth of information may be both intimidating and confusing. Hopefully, this paper will dispel some of these concerns so that the more detailed Video Technology Advisory Service (VTAS) documents may then be viewed with more confidence.

JANET is working in parallel with Becta (formerly known as the British Educational Communications and Technology Agency) to implement a National Schools Videoconferencing Network, and it is recommended that prospective users also search for videoconferencing on the Becta site, where a wealth of information is available on the uses of the medium.

2. Introduction

What is videoconferencing? Videoconferencing may be described as a method of conferencing between two or more locations where both sound and vision are transmitted and received so as to enable simultaneous interactive communication. Due to its cost it was originally only used by multi-national companies to link worldwide sites. However, as the technology has improved and costs have fallen dramatically, it is now used extensively in education and commerce. Videoconferencing can save significant amounts of money in terms of both travel costs and time; it can also open up new methods of communication, e.g. linking several schools together to enhance the learning experience.

Videoconferencing is certainly growing very rapidly, and can save a great deal of money, but is it a new technique? No, videoconferencing has been with us for decades. Multinational corporations, for example, have been routinely using it since the 1980s. Most television viewers will have experienced videoconferencing as passive observers without realising it: television broadcasters have routinely used two-way interviews (for sports coverage etc.) from as early as the 1960s, where a link presenter in the studio interacts with, for example, a football manager at a remote football ground. These two are actively videoconferencing, but are not normally considered in this light. This type of conferencing is achieved at a high technical standard with broadcast quality sound and pictures.

So why has videoconferencing become topical and much more popular since the 1990s? The main reasons are the advances in technology and equipment, making these now available at a reasonable cost. Both of the examples outlined above historically used dedicated methods of connection between the sites. Dedicated connections, whether by landline or satellite link, provide high quality but are very expensive, so the need had to be overwhelming to justify the cost.

The introduction of dial up digital connections by the telephone companies and access to the Internet, coupled with parallel advances in technology, have now made reasonable quality conferences to a remote site affordable to a much wider group of users. At an even lower quality, and perhaps not really applicable to education, two-way videoconferencing is fairly common between mobile phone users. The Microsoft Network (MSN) is another platform which enables this low end videoconferencing.

3. The Basics

Videoconferencing can conveniently be broken down into three components:

1. The conference environment, i.e. the classroom or conference room
2. The conference equipment, which converts the images and speech of the participants into a format that enables transmission to a remote site over the network
3. The conference network that links the sites together

All videoconferences must have these three basic elements; however simple or complex, the conference can always be broken down into environment, equipment and network.

Conferencing is not limited only to talking heads, i.e. the participants: slides from a PowerPoint presentation running on a PC may be introduced to illustrate a particular point, or perhaps moving sequences from a PC application or DVD player. It is even possible for a PC application running at one site to be shared with another site so that either site can annotate and take control of the software. When PC documents are shared in this way it is termed Data Sharing.

4. What is Videoconferencing used for?

Meetings: this is probably the most popular application. The cost savings can be appreciable, especially for international conferences, where not only the travel costs are saved but also the significant time spent travelling to destinations. A company with branches in Europe, the United States and the Far East can use videoconferencing to communicate effectively and run its business; geographical distance is no longer a barrier. This travel replacement, though, can sometimes act as a disincentive for some people to videoconference: despite the real costs and inconvenience of travel, many actually enjoy the experience, especially when they are able to cover the cost with expenses.

Teaching: videoconferencing can open up new dimensions to teaching. With videoconferencing, the teacher no longer has to be in the same room as the students. A national or even an international subject specialist can be linked to any number of sites in the United Kingdom or even the world. The lack of a local expert need not prevent students following a particular course. Several sites may also collaborate to present a subject more comprehensively. For example, Monkseaton High School in North Tyneside regularly shares language teaching with other schools; they also conference with a local school as a member of the Partner Scheme to assist students who have physical and/or communication difficulties.

The ability to introduce illustrative material, either as still images or moving sequences, to both the local and remote audiences enhances the presentation. Videoconference sessions can take many forms, including:

1. One-to-one conferencing from PC desktops, for which there is now a range of software; the quality and available features vary widely.
2. A teacher by him or herself teaching remotely to a group of students.
3. A teacher in a classroom teaching his or her own students and students in another school.
4. A teacher with both local and remote students, but with the added dimension of full interaction from both student groups.

In the second scenario, the teacher can conference from any convenient room that is suitably equipped, even an office (see Figure 1). They can select appropriate sources for their lesson, i.e. camera, Whiteboard, PC, DVD player etc., as appropriate. The remote site will be able to interact and interject in a similar way to that which would occur during a normal lesson, even though the teacher may be hundreds of miles away. The teacher's main concern would be to transmit decent images and accompanying sound. A link person would coordinate the students and equipment at the remote site.

Figure 1: A Teacher Videoconferencing to a Remote Site

The third situation requires a different approach. If a local teaching session to a group of students is also simultaneously relayed by videoconferencing to a remote site, then the room needs to be specially arranged. The teacher will now need to face the local students but also be seen clearly at the remote site. The room would need to be laid out in a similar way to Figure 2 below. The local students can view PC illustrations etc. on the local picture monitor.

Figure 2: Teaching Local Students and Relayed to a Remote Classroom

The teacher will also need to view their own image, together with any illustrations that they are using, while facing the local audience. These illustrations may originate from a PC, a DVD player or from a Whiteboard, now in common usage throughout schools. A small television monitor, termed a preview monitor, is shown on the desk in Figure 2 and is used to view any of these local images prior to and during transmission to the remote site. As the lesson is simply being relayed to the remote site, interaction is not expected, so remote site pictures and sounds are not relevant.

Finally, where teaching is being conducted both locally and remotely and full interaction is needed, we have a more complex situation. Pictures and sound are now required from the remote site. The local students and the teacher will need to see and hear the remote students. A second television display facing the students will normally be provided but the teacher also requires a second television preview monitor on their desk facing them in order to view the remote students (see Figure 3).

Figure 3: Teaching both Local Students and Remote Students with Full Interaction

Television picture monitors have been illustrated above, but in many cases data or large screen projectors will be better for the student displays, as the images can be much larger than those produced by television monitors and still be acceptably bright under ambient lighting levels. However, care has to be taken with some data projectors, as the noise from their cooling fans can be quite intrusive.

In higher education, large scale teaching over videoconferencing is carried out to overcome geographical problems. The Welsh Video Network links most Universities and Colleges in Wales. In Scotland, the remote Highland and Island areas have been linked together to create The University of the Highlands and Islands, so that students may link with partner institutions for collaborative teaching.

Management: colleges with split campuses need to communicate regularly and efficiently, and videoconferencing is seen as a cost-effective solution. A good example is the Open University, which conferences between sites and also links remote sites to its regional centres.

Remote Diagnosis: where specialist medical resources are scarce, e.g. in a rural area, cottage hospitals and GP surgeries can be linked to the regional teaching hospital. Patients can then receive rapid specialist diagnosis.

Interviews: this use is growing in popularity. It is found by many users to be just as effective as face-to-face interviews, particularly when data sharing is used to compare CVs etc. For international interviews, videoconferencing can include candidates that previously would have been excluded purely on grounds of cost; this is particularly relevant for local authorities. Some interviewers, however, regard the conference experience as quite inferior when compared to a face-to-face meeting, complaining that the medium lacks spontaneity and that nuances of body language are lost. Other interviewers use videoconferencing as a first stage filter in the selection process; they are thus able to view many more potential candidates due to the cost savings, especially where these applicants are based overseas.

Legal Work: the nature of many legal cases can involve communication between parties in different locations, even different continents, and videoconferencing helps to reduce costs. It is also being used for live in-court sessions. In particularly sensitive cases, e.g. those involving child abuse, intimidation and rape, videoconferencing can make the experience far less traumatic for the victims.

5. Advantages

- Convenience
- Cost savings for travel, accommodation and staff time
- Ability to link several sites simultaneously
- Access to remotely located experts
- In child abuse or other court proceedings, victims' evidence provided via videoconferencing can reduce the potential for intimidation
- Having a set time for the meeting encourages more control and less time wasted on non-agenda items

6. Disadvantages

- The quality of the received images can be compromised by the technology
- On lower quality links, movement can be jerky
- Body language can be lost if movement is jerky and/or picture quality is reduced
- There may be a delay on the sound that participants need to get accustomed to
- Some believe that the atmosphere of a normal face-to-face meeting is lost

Certainly some of these disadvantages may affect the experience, especially over lower quality links. For meetings, it is said that already knowing the other participants is a distinct advantage. This is probably true for most meetings, but may be especially important for videoconferences where the technical quality is compromised.

7. The Conference Environment

School rooms that are effective for conventional teaching and face-to-face meetings may not be suitable for videoconferencing, but music rooms may provide a good starting point for a conferencing room as they could already possess some of the desirable characteristics. The provision of an effective environment for videoconferencing requires particular attention to the room's characteristics; this is true both for sophisticated teaching sessions and for simple desktop conferencing from a PC with plug-in cards.

The room needs to be quiet, as microphones can accentuate unwanted sounds. The human ear may be quite tolerant of ambient noise such as traffic, air-conditioners, students in corridors etc., but a microphone is not so discriminating, and the intrusive noise can effectively render a conference unworkable. Echoes within a room (or reverberation) must be controlled, otherwise the conference sound will be degraded; this is particularly relevant in schools, where some classrooms can suffer severely from echoes. Television cameras are sensitive to changes in light levels, so windows pose a special problem, as light levels can fluctuate wildly. The overall lighting level also needs to be high to produce the best possible images from the camera. Decorations within the room need to be chosen carefully: busy patterned backgrounds are not ideal, and plain matt surfaces are to be preferred. All of these factors, and several more, are covered in greater detail in the VTAS document Videoconferencing Rooms.

An acceptable environment is the starting point; without this, sound and vision quality will be impaired. Unfortunately, many potential users of the medium have been put off by a bad experience in an unsuitable room. If the characteristics of the room are unsuitable, no amount of expensive equipment will correct the resulting poor quality of the pictures and sound.

8. Videoconferencing Equipment Types and Options

In its most basic form, a videoconference involves transmission of the images and the voices of the participants to a remote site. Optional sources include still images and/or moving sequences from a video recorder, DVD or a PC. The basic conference requires:

- A television camera to capture images of the participants
- A microphone to pick up their speech
- A means of transferring this sound and vision information to the remote location

8.1. The Television Camera and Lens

See Appendix A. The camera converts light images into an electrical signal so that it may be displayed, recorded or transmitted. The camera lens focuses the images onto a light sensitive layer sometimes referred to as a Charge Coupled Device or CCD. The type of lens determines the angle or field of view: a wide-angle lens can capture a large group, a narrow angle lens a close up. The ubiquitous web-cam is a very low cost (and low quality) camera with a fixed focus lens designed only to provide an image of the PC operator.

Lenses suitable for group conference applications will have a variable field of view; these are known as zoom lenses. Zoom lenses can zoom out to a wide-angle shot to include a group, or zoom in to a close up shot of a single participant.

The sensitivity of the lens is determined by the amount of light entering the lens; most cameras adjust sensitivity automatically, but sometimes there is manual adjustment. Both methods move an iris-diaphragm in front of the lens to effectively increase or reduce the diameter (or aperture) of the entry pupil of the lens. Most cameras control the iris automatically to optimise image quality for the available light. It is important to realise, however, that in low light conditions, as the iris is opened up to increase the sensitivity, there will be less depth of field, and the resultant picture overall may not appear sharply focussed. Television cameras are capable of operating in very low light conditions, but they produce much higher quality pictures at higher light levels. At high light levels the lens aperture will also be reduced, giving the added advantage of a greater depth of field.

In broadcast television this increase in depth of field may not necessarily be seen as an advantage. For artistic effect, directors may require the foreground to be in sharp focus with the background out of focus; this gives separation to the images. In videoconferencing, however, the general rule is the more depth of field the better.

8.2. Microphones

A microphone converts sound energy into an electrical signal to enable it to be recorded or transmitted to a remote location. It has already been mentioned that microphones also pick up unwanted sounds together with the participants' voices. If these unwanted sounds are significant they will interfere with, or even completely obscure, the desired parts of the sound. The correct choice of microphone and its position in the room are vital to achieve high quality sound. The conference environment will also influence sound quality; as mentioned above, both low ambient noise levels and low levels of echo are essential requirements.

8.3. The CODEC

The CODEC has two components: the COder and the DECoder. The COder takes the local sound and vision signals and converts them into a form that can be transmitted over a digital network. The DECoder performs the reverse function, i.e. it takes the remote site's digital signals from the network and converts or decodes them into a form that enables the picture monitor to display images and the loudspeaker to radiate sound from the remote site. A CODEC is thus required at each end of the link for a videoconference. CODECs can be packaged as plug-in PC boards for personal desktop use, within purpose designed complete systems for group conferencing known as Rollabouts, or as compact Portable units that sit on top of picture monitors.

From 8.1 above, the electrical signals generated by the camera may, after suitable processing, be recorded; they may also be displayed on a picture monitor or data display. Similarly from 8.2, microphone signals can be recorded or heard via a loudspeaker. It is also possible to transmit these electrical signals directly to another site. These electrical facsimiles of the sound and picture are termed analogue signals. See Appendix B for more details.

Analogue television signals require a network with a huge capacity for transmitting the information, i.e. a very wide bandwidth. Analogue networks are capable of very high quality but can be very expensive to install. They also generally need a dedicated circuit, i.e. one that is only available between specific sites for a particular application. For a short link, e.g. between rooms, dedicated circuits are cost effective as they can be achieved simply by a length of cable, but for anything further the costs can be prohibitive.

Videoconferencing using analogue methods was established by multinational corporations in the 1980s; it was also employed by the broadcasters for sports interviews etc. The potential cost savings (multinationals) and the importance of the message (broadcasting) outweighed the high network costs involved. With advances in technology, analogue transmission has virtually ceased; digital transmission is now the norm for videoconferencing and for both radio and television broadcasting. For videoconferencing to appeal to a wider audience it had to prove cost effective, and this was solved by the arrival of digital networks that were easily accessible.

Analogue and digital signals are very different. Human senses respond directly to analogue stimuli. A typical example of an analogue signal is shown in Figure 4: the sound wave generated by a vibrating tuning fork. A microphone close by will convert the sound variations into an electrical signal that mimics the original. The microphone signal varies both in level (amplitude) and frequency (pitch), dependent on the sound being picked up; the electrical signal is a facsimile of the sound wave. Analogue networks therefore need to respond to both amplitude and frequency variations to transmit the information accurately.


Figure 4: Analogue Signals

Digital signals are transmitted as a series of electrical pulses that vary in sympathy with the original signal in various ways, dependent on the digitisation system in use. The level or amplitude of the pulses is of no significance; only the presence or absence of a pulse needs to be detected. Digital transmission is thus much more tolerant of poor networks. A basic explanation of digital signals is given in Appendix B. More information on videoconferencing equipment is available from the VTAS document Videoconferencing Audio & Video Equipment.
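To give a feel for how much work the CODEC described in 8.3 has to do, the short Python sketch below compares the bit rate of an uncompressed digitised television picture with a typical conference link. The frame size, frame rate and sampling figures are illustrative assumptions (roughly those of a standard definition PAL picture with 4:2:0 colour sampling), not values taken from this document.

```python
# Rough illustration only: how much an uncompressed digitised TV picture
# exceeds a typical videoconference link, and hence how hard the CODEC
# must compress. All figures below are assumptions (approximate SD PAL).

FRAME_WIDTH = 720          # pixels per line (assumed)
FRAME_HEIGHT = 576         # lines per frame (assumed)
FRAMES_PER_SECOND = 25     # PAL frame rate
BITS_PER_PIXEL = 12        # 8-bit luminance plus subsampled colour (4:2:0)

LINK_KBIT_S = 384          # a common ISDN-6 / IP conference rate

raw_bits_per_second = (FRAME_WIDTH * FRAME_HEIGHT
                       * FRAMES_PER_SECOND * BITS_PER_PIXEL)
raw_mbit_s = raw_bits_per_second / 1_000_000
compression_ratio = raw_bits_per_second / (LINK_KBIT_S * 1000)

print(f"Uncompressed video: about {raw_mbit_s:.0f} Mbit/s")
print(f"Conference link:    {LINK_KBIT_S} kbit/s")
print(f"Compression needed: roughly {compression_ratio:.0f}:1")
```

Even allowing for different assumptions, the gap is several hundred to one, which is why the heavy compression discussed later in sections 12.1 and 12.2 is unavoidable.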

9. The Network

Digital networks, which are now almost universal, have encouraged the current expansion in videoconferencing. They have enabled the medium to be affordable to a much wider market.

9.1. The Internet

All schools now have access to the World Wide Web via the Internet. This technology uses a common method of transmitting and receiving data, called IP or the Internet Protocol standard, to ensure information passes flawlessly between networks. The success of the WWW could not have been achieved without this single ubiquitous standard.

A disadvantage of IP transmission is that because all data traffic, e.g. e-mails, WWW downloads or videoconferencing, has to share the same pipes, it all has to compete for the available space or bandwidth. With e-mails and WWW downloads this may cause delays between sending and receiving data, dependent on the traffic, but all of the transmitted data should eventually arrive at its destination, although some parts of the data may have taken different routes before finishing the journey. Videoconferencing is not so tolerant: it transmits digitised sound and vision signals, and both are very intolerant of missing or delayed parts of the data. When a network is congested and some parts of the data are routed via an alternative but longer path, the recovered sound or vision can be impaired. The degradation in quality may vary from a missing speech syllable or image fuzziness to completely incoherent speech and an unrecognisable image.

IP transmissions do not normally guarantee quality of service at present. However, techniques that prioritise vision and sound signals so that they pass relatively unscathed are being used more frequently by the telecommunications providers.

9.2. Integrated Services Digital Network (ISDN)

Before the Internet achieved its present ubiquitous state and reached every school, the telephone network offered its own digital communication via telephone lines. Termed ISDN (Integrated Services Digital Network), this uses the existing telephone infrastructure to carry digital signals. Offered by telecommunications providers, it is available in many parts of the world. The service is accessed through dial up in the same way as a telephone, so the network does not have to be dedicated between the videoconference centres. The costs include installation (for the special interfaces), rental and call charges.

The quality of ISDN videoconferencing is limited by the quality (or bandwidth) of the connection, varying from 64kbit/s (ISDN-1) to 2Mbit/s (ISDN-30). The normal bandwidth used for meetings is either 128kbit/s (ISDN-2) or 384kbit/s (ISDN-6). One important feature of ISDN is its Guaranteed Quality of Service (GQOS), which means that if, for example, a 384kbit/s link is dialled then, unless there is a fault condition, the full 384kbit/s will be provided for the duration of the conference. The Internet, because of the way the data traffic has to compete for space, is currently unable to offer GQOS. ISDN also maintains identical download and upload speeds. More information on ISDN is available at the BT ISDN website, for example.

9.3. IP and ISDN CODECs

As has been previously mentioned, a CODEC (see 8.3 above) is necessary to enable the sound and vision signals to be converted and then transmitted from the local site. Another CODEC is also required at the remote site to convert the digital signals back into sound and vision signals for the audience. The methods of transmission for IP and ISDN are different and so require different CODECs. ISDN, once the mainstream medium for videoconferencing, has been overtaken by IP, so IP CODECs are now usually offered as standard with an ISDN option if required.

9.4. Broadband Networks

Computers connected to the Internet via a modem and telephone line are limited to a data rate of around 56kbit/s. Broadband connections using ADSL (Asymmetric Digital Subscriber Line) technology achieve rates from 512kbit/s up to around 16Mbit/s. The speed of both these connections will, however, depend on the amount of data traffic currently on the network and, as the name Asymmetric implies, the download and upload speeds differ. Generally data can be downloaded approximately twice as fast as it can be uploaded, which for domestic users is convenient. If a connection with guaranteed speed is required then the telephone provider may be able to install ISDN digital lines (see 9.2) with speeds from 64kbit/s to 2Mbit/s; ISDN lines also have identical upload and download speeds. When numerous data streams from subscribers converge to travel to other towns or countries, very high capacity networks are needed purely to cope with the density of traffic.
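The ISDN rates quoted in 9.2 and 9.4 all come from bundling 64kbit/s channels. The small sketch below simply multiplies out the channel counts; the comment against ISDN-30 reflects the fact that the roughly 2Mbit/s bearer also carries signalling and framing capacity alongside the 30 usable channels.

```python
# ISDN bandwidth is built from 64 kbit/s channels ("B-channels").
# This simply multiplies out the channel counts quoted in section 9.2.

B_CHANNEL_KBIT_S = 64

services = {
    "ISDN-1": 1,     # a single channel
    "ISDN-2": 2,     # the usual minimum for meetings (128 kbit/s)
    "ISDN-6": 6,     # a common choice for better quality (384 kbit/s)
    "ISDN-30": 30,   # carried on a ~2 Mbit/s bearer; 30 x 64 kbit/s is
                     # usable, the remainder is signalling and framing
}

for name, channels in services.items():
    rate = channels * B_CHANNEL_KBIT_S
    print(f"{name:8s} {channels:2d} x {B_CHANNEL_KBIT_S} kbit/s = {rate} kbit/s")
```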

Other methods of transmission such as M-JPEG have been used, especially by the broadcasters who have access to higher bandwidths, but many of these have been replaced by IP as the technology and capacity of the transmission lines have improved.

10. Implementing Videoconferencing - the Options


10.1. Point to Point and Multipoint

Conferencing between two sites is termed Point to Point and is the most frequently used method. It only requires equipment at each site and the network of choice. If more than two sites require a simultaneous conference, this is termed Multipoint conferencing, and an additional specialised piece of equipment is required, called a Multipoint Control Unit or MCU. Some videoconferencing CODECs can support limited multipoint working (e.g. up to 4 sites). Generally, MCUs are very expensive and are rented from service providers just for the duration of a conference. Educational and research organisations which are entitled to register with the JANET Video Conferencing Service (JVCS) have use of JANET(UK)'s MCUs provided for the academic community. The factsheet JANET Videoconferencing Booking Service gives details of registration and the service.

10.2. Voice Switched Conferences

Multipoint conferences introduce additional constraints. With Point to Point conferences, both sites can see and hear each other throughout the conference. With Multipoint conferences it would be unworkable for all sites to be seeing all other sites simultaneously. Normally, therefore, the conference is voice switched: the image of the site currently speaking takes precedence and is seen by all the other sites (see the sketch after 10.4 below).

10.3. Chairman Control

Another option with Multipoint working is Chairman Control, where all sites receive pictures from the Chair site alone. The Chair site will see images from the remote site currently speaking. All sites receive all sound.

10.4. Continual Presence

A third option is Continual Presence, where the picture is segmented to show images of each site continuously, with the sound being voice switched. With continual presence it is important to appreciate that the greater the number of sites, and the smaller their individual images, the less easy it is to recognise anything meaningful on the viewing screen. It does depend on viewing distance etc., but generally continual presence is unworkable for more than four sites. The JVCS MCUs currently have an upper limit of 9 sites in continual presence mode, although this is likely to increase in the near future.
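The Python sketch below is a purely conceptual illustration of the voice switching described in 10.2; it is not how any particular MCU is implemented. It picks the site with the loudest recent audio as the one whose picture is sent to everyone else, and holds that choice briefly so that coughs and short interjections do not cause constant switching. The audio levels and the hold time are invented for the example.

```python
# Conceptual sketch of a voice-switched multipoint conference (see 10.2).
# Not a real MCU algorithm: it selects the loudest site and holds the
# selection for a minimum time to avoid rapid switching.

HOLD_SECONDS = 2.0   # assumed minimum time before switching again

def choose_speaker(levels, current, held_for):
    """levels: dict of site name -> recent audio level (arbitrary units)."""
    loudest = max(levels, key=levels.get)
    if current is None:
        return loudest, 0.0
    if loudest != current and held_for >= HOLD_SECONDS:
        return loudest, 0.0          # switch the broadcast picture
    return current, held_for         # keep showing the current site

# Invented audio levels sampled once per second at four sites.
samples = [
    {"A": 0.9, "B": 0.1, "C": 0.1, "D": 0.1},
    {"A": 0.8, "B": 0.2, "C": 0.1, "D": 0.1},
    {"A": 0.1, "B": 0.9, "C": 0.1, "D": 0.1},  # site B starts talking
    {"A": 0.1, "B": 0.8, "C": 0.1, "D": 0.1},
]

current, held_for = None, 0.0
for second, levels in enumerate(samples):
    current, held_for = choose_speaker(levels, current, held_for)
    held_for += 1.0
    print(f"t={second}s  picture seen by the other sites: {current}")
```

A real MCU would make this kind of decision continuously on the decoded audio and combine it with modes such as Chairman Control and Continual Presence described above.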

11. Data Conferencing


Material from a computer can enhance a videoconference. Spreadsheets, word processed pages, PowerPoint presentations and video sequences etc. can all be viewed by both local and remote sites if required. With appropriate technology both sites are also able to share a document and annotate it to illustrate a point; this is termed Data Sharing.

There are four principal methods of data sharing:

1. Within a conventional videoconference, images of the participants may be replaced with images from the computer screen. This can be useful for showing illustrations but it is not interactive (i.e. data can be seen by, but not edited by, the remote site(s)). Most CODECs are now provided with a VGA interface to permit this mode.

2. By the use of suitable software running on a PC, interactive data sharing can be established between the sites.

3. A space can be created within the data stream occupied by the conferencing image and sound signals and PC data slotted in. This enables data to be displayed and images (and sound) of the participants to be viewed simultaneously. This process does degrade sound and vision quality, and as only a limited space can be created for the PC data its usefulness is restricted, i.e. some data packages run only very slowly or not at all. This method of interleaving PC data with sound and vision data requires the CODECs to be T.120 compliant, and has now been virtually replaced by the other options.

4. Some CODECs now offer the facility of providing two vision channels for the conference. Images of the presenter occupy the main channel along with their sound, while a second vision channel may carry PC images or images from a second camera. No sound is carried in this second channel, so if sound is required, say from a DVD video sequence, then this would need to occupy the main channel, with the presenter in vision only on the second channel. This facility is termed Dual-video and is now offered on several CODECs. Dual-video requires the CODEC to comply with standard H.239 to ensure the feature operates effectively with other CODECs offering a second video channel.

12. Taking Part in a Videoconference - Advice for Participants


12.1. Dressing for the Camera and CODEC

Videoconferencing is a form of television, so the guidelines for appearing on television are relevant. Because videoconferencing uses a reduced quality network link to keep costs down, some other limitations are also introduced. Television cameras can only handle a very limited range of contrast, so wear clothes in pastel shades and plain weaves; strong saturated colours and white shirts are not recommended. Within a videoconference, faces are the focal point, so clothing must not predominate the image. Avoid clothes that are brightly coloured or have a distinctive pattern.

Videoconferencing signals are heavily compressed by the CODEC to enable transmission over cost effective networks. Compression of the visual signal is achieved by removing elements of the picture that remain unchanged between pictures (redundant information). For a participant wearing plain clothes, the relative changes in the image between successive picture frames will be concentrated on movements of the face and arms, as the image of the clothes will remain virtually unchanged. If, however, they are wearing a busily patterned shirt, the changes will be significant. The large amount of changing information will absorb an appreciable part of the available transmission space, leaving less space for the more important facial images, which will consequently be degraded. See the webpage Participating in a VC for more information.
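The point about patterned clothing can be illustrated with a toy frame-difference count, which is the kind of inter-frame redundancy a video CODEC exploits. Real CODECs work on blocks, motion vectors and transforms, so the sketch below is only an analogy; the "frames" are invented one-dimensional rows of pixel values, and shifting a plain row by one pixel changes almost nothing while shifting a finely striped row changes nearly every pixel.

```python
# Toy illustration of inter-frame redundancy (see 12.1). A plain region
# barely changes when it moves slightly; a busy pattern changes a lot,
# so more data has to be sent. Real CODECs are far more sophisticated.

def changed_pixels(frame_a, frame_b, threshold=10):
    """Count pixels whose value changes by more than `threshold`."""
    return sum(1 for a, b in zip(frame_a, frame_b) if abs(a - b) > threshold)

def shift_right(frame):
    """Simulate a small sideways movement by one pixel."""
    return frame[-1:] + frame[:-1]

plain_shirt = [128] * 64                                    # one flat grey row
striped_shirt = [30 if i % 2 else 220 for i in range(64)]   # fine stripes

for name, row in [("plain", plain_shirt), ("striped", striped_shirt)]:
    moved = shift_right(row)
    print(f"{name:7s} shirt: {changed_pixels(row, moved)} of {len(row)} "
          f"pixels changed after a 1-pixel movement")
```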

12.2. Delays in the Sound (Latency)

To keep transmission costs to a minimum, the data rate for videoconferencing is generally very low. This means that the vision signals need significant compression to squeeze into the small space available. Compression requires a considerable amount of electronic processing, and one penalty to pay for this is the time taken for the vision signals to travel through all the circuitry. The delays are appreciable and can be of the order of 0.25 second. The delays introduced in compressing the sound signals are very much less, as not so much signal processing is needed. The result of this is that sound and vision from a site will be transmitted (and received) out of synchronisation unless the situation is corrected. Even small errors are objectionable, as demonstrated on television by films that are transmitted with a lack of lip synchronisation. To overcome this problem in videoconferencing, the sound signals are delayed to synchronise them with the vision. Two consequences of this delayed sound are that there can be an appreciable delay when conferencing with a remote site (latency) and that an echo can also be generated. This echo is most objectionable and can render a conference unintelligible. For more detail see Appendix C.

12.3. Echo Cancellers

To reduce echoes caused by transmission delays, special devices known as Echo Cancellers are used. These devices, when operating efficiently, can almost eliminate all traces of echo during a conference. Room acoustics and microphone positions also affect the level of echo. If a microphone is moved or its sensitivity is altered during the conference, echo could be introduced until the echo canceller has realigned itself to the new environment. A good echo canceller is dynamic in operation, continually monitoring the situation and altering the correction as the conference takes place to reduce echo to a minimum. Echo cancellers are an essential element of high quality videoconferencing.

12.4. Conferencing with Delayed Sound

When videoconferencing over ISDN, IP and other networks, although echoes can be reduced to a manageable level, the delay on the sound (latency) remains. This introduces an unnatural element into a videoconference that participants need to be aware of. All participants have to learn to conference within the limitations of the time delay, so interaction is less spontaneous than with a face-to-face meeting. As most conferences are voice switched, sharp interjections will cause the conference to switch automatically to that site, which may clip another site's presentation. All participants have to be patient and let other sites finish their point before responding. Videoconferencing with latency encourages participants to wait until others finish before responding, which, after all, is how meetings should be conducted in any case, but seldom are.
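As a concrete picture of the lip-sync correction described in 12.2, the sketch below delays an audio stream by the amount needed to match an assumed 0.25 second video coding delay. The sample rate, the delay figure and the helper function are assumptions for illustration; real CODECs manage this internally.

```python
# Sketch of the lip-sync correction in 12.2: audio is deliberately held
# back so that it stays in step with the (slower) compressed video.
# The 0.25 s figure and the 16 kHz sample rate are assumed for illustration.

VIDEO_DELAY_SECONDS = 0.25
SAMPLE_RATE = 16_000                      # audio samples per second

delay_samples = round(VIDEO_DELAY_SECONDS * SAMPLE_RATE)

def delay_audio(samples, delay):
    """Prepend `delay` samples of silence so the audio arrives late,
    in sync with the delayed video. Real systems use a buffer instead."""
    return [0] * delay + list(samples)

audio = [1, 2, 3, 4, 5]                   # stand-in for real audio samples
delayed = delay_audio(audio, delay_samples)

print(f"Audio delayed by {delay_samples} samples "
      f"({VIDEO_DELAY_SECONDS * 1000:.0f} ms) to match the video")
```

The price of this correction is exactly the latency and potential echo discussed in 12.3 and 12.4 and in Appendix C.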

13. Conclusion


Videoconferencing does use fairly complex technology to achieve its aim of enabling communication. A user, however, does not require technical knowledge to take advantage of the medium: if you can use a PC you should have no difficulty with videoconferencing. Hopefully this paper helps to bridge the gap between being a complete novice and having sufficient understanding to realise the potential of videoconferencing.

While background reading can be useful, there is no substitute for experience, and the reader is strongly urged to experience videoconferencing first hand by taking part in conferences. The JANET Video Conferencing Service (JVCS) offers on-line test facilities for videoconferencing and would be pleased to assist. The JANET Video Technology Advisory Service (VTAS) provides unbiased technical advice on equipment and issues related to videoconferencing; product evaluations and a variety of documents are published on the web site.

14. Acknowledgement
I am indebted to Dr Paul Kelley, Headmaster of Monkseaton High School, North Tyneside, for proof reading the document and offering several useful suggestions.

Appendix A - Television camera lens sensitivity


Television camera sensitivity depends on two factors: the efficiency of the light sensitive area (Charge Coupled Device or CCD) that converts light energy into an electrical signal, and the amount of light that the lens transmits from the scene. As the CCD sensitivity is fixed, the only method of altering the camera's response to varying scene illumination is to adjust the amount of light entering the lens. This is achieved by means of an iris-diaphragm in the light path, which opens and closes like the iris in a human eye and increases or reduces the effective aperture through which light can enter the lens. The position of the diaphragm is indicated by f-numbers; typical values will cover the range f2.8 to f22. The lower the f-number, the more light is passed.

One consequence of opening up the lens aperture to provide more light, and thus more sensitivity, is that the depth of field reduces. Depth of field is the depth of the image in sharp focus. In other words, if a lens is focused on a person's eyes at its minimum aperture, typically f22, then the eyes, the nose, in fact the whole body and most likely the background as well will all be in sharp focus, but the lens will only be passing a small amount of light. The scene would therefore need to be very bright to produce an acceptable camera image. If the lens is opened up to f2.8 (a typical maximum value) to pass more light, then the eyes alone may remain in sharp focus, but all the rest, even the nose, may be rendered fuzzy or defocused. There is thus a trade off between lens sensitivity and depth of field.

Most cameras control the iris automatically to optimise image quality for the available light. It is important to realise, however, that in low light conditions, as the iris is opened up to compensate, there will be less depth of field, and the resultant picture overall may not appear to be in sharp focus.
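The trade-off described above can be put into rough numbers with the standard thin-lens depth of field approximations. The focal length, circle of confusion and subject distance in the sketch below are assumptions chosen to be plausible for a small videoconferencing camera, not measured values, so treat the output as indicative only.

```python
# Indicative depth-of-field figures using the standard approximations:
#   hyperfocal distance H = f^2 / (N * c)
#   near limit ~ H*s / (H + s), far limit ~ H*s / (H - s)  (infinity if s >= H)
# f = focal length, N = f-number, c = circle of confusion, s = subject distance.
# All values below are assumptions for illustration (small-sensor camera).

FOCAL_LENGTH_MM = 12.0
CIRCLE_OF_CONFUSION_MM = 0.006
SUBJECT_DISTANCE_MM = 2000.0      # participant about 2 m from the camera

def depth_of_field(f_number):
    h = FOCAL_LENGTH_MM ** 2 / (f_number * CIRCLE_OF_CONFUSION_MM)
    s = SUBJECT_DISTANCE_MM
    near = h * s / (h + s)
    far = h * s / (h - s) if s < h else float("inf")
    return near / 1000, (far / 1000 if far != float("inf") else far)

for n in (2.8, 22):
    near_m, far_m = depth_of_field(n)
    far_text = "infinity" if far_m == float("inf") else f"{far_m:.2f} m"
    print(f"f/{n:<4}: sharp from about {near_m:.2f} m to {far_text}")
```

With these assumed figures, stopping down to f/22 keeps everything beyond roughly 0.7 m in focus, while opening up to f/2.8 shrinks the sharp zone to about a metre around the subject, which matches the behaviour described above.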

Appendix B - Analogue and digital signals


B.1 Analogue Signals

Human senses respond to analogue signals. If a tuning fork is struck it vibrates to create a characteristic sound wave, as illustrated in Figure B. 1. The wave is a typical analogue signal, and it would gradually decay to zero as the mechanical energy in the fork is expended as radiated sound. A microphone is a device that converts sound energy into electrical energy. A microphone placed near to a resonant tuning fork would generate an electrical signal as shown below. The electrical signal is similar in shape to the original sound wave and it is a typical analogue signal.


Figure B. 1: Analogue Signals

B.2 Digital Signals

An electrical switch can either be OFF or ON, i.e. open or closed. In Figure B.2 the switch is open and the lamp is off. With the switch closed the lamp illuminates. A single switch therefore has two states, OFF and ON. This can be represented by OFF = 0 and ON = 1. If there are two switches, we can have four or 2² possible states:

Both OFF                        i.e. 00
The first ON, the second OFF    i.e. 10
The first OFF, the second ON    i.e. 01
Both ON                         i.e. 11

Figure B. 2: The Two Binary States Represented by a Lamp

If there are three switches, there can be eight or 2³ possible states:

A  B  C   Equivalent analogue number
0  0  0   0
0  0  1   1
0  1  0   2
0  1  1   3
1  0  0   4
1  0  1   5
1  1  0   6
1  1  1   7

So with three switches there are eight possible combinations. It will be noticed that a 1 in the column below switch C represents 1 or 2⁰, under switch B, 2 or 2¹, and under switch A, 4 or 2². This system of numbering, based on powers of 2, is termed binary. A further column to the left of A would indicate an even higher power of two, i.e. a 1 in this column would represent 2³ (8), and so on for additional columns.

B.3 Sampling

A portion of the microphone's analogue electrical signal (from Figure B. 1) is sampled as illustrated in Figure B. 3. Between the points G to N, the amplitude is seen to vary between 0 and 7.

Figure B. 3: Sampling

Sampling produces these values:

At:     G  H  I  J  K  L  M  N
Value:  0  4  6  7  7  6  4  0

However, these analogue levels may alternatively be represented in binary number form using the three switches A, B and C:

       Value  A  B  C
G      0      0  0  0
H      4      1  0  0
I      6      1  1  0
J      7      1  1  1
K      7      1  1  1
L      6      1  1  0
M      4      1  0  0
N      0      0  0  0

As an alternative to 3 switches A, B and C it is possible to use electronics to generate voltage pulses. The absence of a pulse at positions A, B or C represents 0. A pulse in position A represents 4, at B, 2 and at C, 1. This is illustrated in Figure B. 4.


Figure B. 4: Voltage Pulses Representing Digital Values

B.4 Analogue to Digital Conversion

If an analogue electrical signal, e.g. the signal from a microphone, is sampled, the samples converted to binary numbers, and pulses equivalent to those binary numbers then generated (similar to Figure B. 4), the process is termed analogue to digital conversion. The electronic device that performs this conversion is called an Analogue to Digital Converter (ADC), an essential component of all digital systems. A comparison can be made between the sample values for our original microphone signal, their digital (or binary) numbers and the resulting digital pulse sequence; Figure B. 5 illustrates this.

Figure B. 5: Comparison between Sampling values and the digital representation
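The comparison illustrated in Figure B. 5 can be reproduced with a few lines of Python, using the same sample values (0 to 7) and the same three-bit A, B, C representation as the tables in B.3. This is only a sketch of the idea; a real ADC works on a continuous voltage, not a list of numbers.

```python
# Reproducing the idea behind Figure B.5: each sampled level (0-7) is
# written as a three-bit binary number, i.e. the states of switches A, B, C.

samples = {"G": 0, "H": 4, "I": 6, "J": 7, "K": 7, "L": 6, "M": 4, "N": 0}

print("Point  Value  A B C")
for point, value in samples.items():
    bits = format(value, "03b")          # e.g. 6 -> '110'
    print(f"{point:5s}  {value:5d}  {' '.join(bits)}")
```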

Human senses respond to analogue, not digital, signals; in other words, we hear the noise emitted by a tuning fork as an analogue stimulus. The digital equivalent (i.e. a series of pulses) would be unrecognisable, so although digital sound signals may have advantages for travelling over a network, they have to be converted back to their analogue equivalent to be meaningful to a listener. Exactly the same is true for visual information: the human eye, like the ear, responds to analogue stimuli, so devices are needed to reconvert digital vision signals back to analogue. This is achieved by Digital to Analogue Converters (DACs). ADCs and DACs are two essential building blocks of digitisation.

Figure B. 6 shows the result of digital to analogue conversion for our original microphone signal. As can be seen, the reconstruction is not exact: if Figure B. 6 and Figure B. 1 are compared, the smooth curved edge of the original sine wave has been clipped into a series of straight lines. This is a disadvantage of digitisation: digitisation errors have been introduced. These errors can be minimised, and made practically invisible, by sampling at a high enough speed and having a sufficiently large number of reference values, so providing sufficient resolution or accuracy for the sampling.

Figure B. 6: Reconstructed Signal showing Digitisation Errors

With high speed, high resolution sampling there will be many more sampling points, and so the digitised, clipped, reconstructed edge will more accurately mimic the original smooth curve. The speed at which an analogue signal is sampled is known as the Clock Speed. The accuracy of the digitisation depends on the length of the digital number (or the number of switches/pulses involved). A 3 bit system can resolve eight levels, i.e. 2³ or 2x2x2. An 8 bit system can resolve 2⁸ (2x2x2x2x2x2x2x2) or 256 different levels, whereas a 24 bit system can resolve 2²⁴, or 256x256x256, or 16,777,216 levels.

Digital signals have many advantages over their analogue equivalent. On an analogue video recorder, every time a tape (or cassette) is copied the copy suffers a reduction in quality. In broadcasting, six or more generations of copying are commonplace. With digital tape copying, the sixth generation copy will be almost identical to the original, with little or no degradation. This is because only the presence or absence of pulses needs to be detected; the magnitude of the pulses (unlike analogue signals) is of no consequence.
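The effect of word length on resolution can be checked directly: the sketch below prints the number of levels for 3, 8 and 24 bit systems, then quantises one sample to show how the digitisation error shrinks as bits are added. The full-scale range and the example value are arbitrary choices; the arithmetic is the point.

```python
# Resolution versus word length (see the figures quoted above), plus a
# small demonstration of how digitisation error shrinks with more bits.
# The full-scale range and the example value are arbitrary choices.

FULL_SCALE = 1.0          # assumed signal range 0.0 .. 1.0
EXAMPLE_VALUE = 0.374     # an arbitrary analogue sample within that range

for bits in (3, 8, 24):
    levels = 2 ** bits
    step = FULL_SCALE / (levels - 1)                 # spacing between levels
    quantised = round(EXAMPLE_VALUE / step) * step   # nearest digital level
    error = abs(quantised - EXAMPLE_VALUE)
    print(f"{bits:2d} bits: {levels:>10,} levels, "
          f"error on {EXAMPLE_VALUE} is about {error:.8f}")
```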

Signal processing, e.g. compression of television signals for videoconferencing, is achieved much more easily in the digital domain. By using signal compression, many more digital television channels can be squeezed into the space originally taken by one analogue channel. This may not be seen to be an advantage by everyone but is particularly important for satellite transmission where bandwidth (or channel space) is at a premium and very expensive.

Appendix C - Codec audio delays


It has been mentioned that sound and vision signals do not take the same amount of time to pass through the CODEC, due to the dissimilar delays through the video and audio compression electronics. As a consequence, the sound signals are delayed to synchronise them with the vision. The effect of this delayed sound is to produce a most objectionable echo when conferencing with a remote site. If this echo is not reduced, effective conferencing becomes impossible. The reason that this occurs is explained below.

As shown in Figure C. 1, sound (voices) from the local site A is initially delayed by the CODE part of local CODEC A, then delayed further by the DECODE part of CODEC B at the remote site, before being radiated by the loudspeaker to the remote audience. A proportion of these delayed voices from the local site is picked up by the remote site's microphone and fed back, via CODE B and DECODE A, now even further delayed, to the local site's loudspeaker. This sound (now significantly delayed) is then picked up again by microphone A, together with the live (un-delayed) voices from A, which generates the characteristic echo. These echoes continue throughout the conference unless precautions are taken to minimise them.

Figure C. 1: Echoes due to CODEC delays

Network processing equipment other than CODECs can also introduce delays. This is particularly true for IP networks, where the signals have to pass through several items of equipment in the transmission chain. The latest audio coding, termed MPEG-4 AAC-LD, has however been specially designed to significantly reduce this transit delay.
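A rough sense of why the echo in Figure C. 1 is so disruptive can be had by adding up the delays around the loop. The individual figures in the sketch below are assumptions for illustration (the document only states that the coding delay is of the order of 0.25 second); the point is that your own voice returns after roughly twice the one-way delay, long enough to be heard as a distinct echo rather than merging with your speech.

```python
# Rough, illustrative sum of the delays around the echo loop in Figure C.1.
# The individual figures are assumptions; only the order of magnitude matters.

CODEC_DELAY_S = 0.25        # coding plus matching audio delay, per direction
NETWORK_DELAY_S = 0.05      # assumed one-way network transit time

one_way = CODEC_DELAY_S + NETWORK_DELAY_S    # local talker -> remote loudspeaker
echo_return = 2 * one_way                    # ... and back again via the remote mic

print(f"One-way delay:   about {one_way * 1000:.0f} ms")
print(f"Echo returns in: about {echo_return * 1000:.0f} ms")
```

Delays of several hundred milliseconds are easily perceived as a separate repetition of your own voice, which is why the echo cancellers described in 12.3 are essential.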

Appendix D - Useful Information Sites


If you are just starting up in videoconferencing or need more information, some useful links are summarised below.

D.1 Setting-up a videoconferencing suite

D 1.1 The videoconference room:
http://www.video.ja.net/documents/services/video/vcrooms.pdf
http://www.video.ja.net/documents/services/video/vtas/037_Planning_Rooms.pdf

D 1.2 Videoconferencing audio and video equipment:
http://www.video.ja.net/documents/services/video/vcaudiovideo.pdf

D 1.3 VTAS product evaluations (in depth evaluations of CODECs):
http://www.video.ja.net/services/video/vtas/productevaluations/index.html

D.2 Becta

Excellent advice on many aspects of videoconferencing in schools, with a comprehensive list of current applications in use:
http://www.becta.org.uk/

D.3 Using JANET video services

D 3.1 Home page:
http://www.ja.net/services/video/

D 3.2 JANET Video Technology Advisory Service (VTAS):
The Video Technology Advisory Service (VTAS) provides free, unbiased technical advice and information to current and potential users of video technologies. The JANET community and other organisations that may be eligible to register with the JANET Videoconferencing Service (JVCS) are entitled to use VTAS. Due to the distributed nature of the service, e-mail is the preferred mode of contact for initial enquiries. If you would like to be put in contact with a VTAS adviser, email the JANET Service Desk at service@janet.ac.uk with VTAS in the subject line.
Home page: http://www.ja.net/services/video/vtas/index.html
Factsheet: http://www.ja.net/documents/publications/factsheets/024-vtas.pdf

D 3.3 Registration for use of the JVCS multipoint facility:
http://www.video.ja.net/documents/services/video/vtas/intorbookingserv.pdf

D 3.4 Booking a multipoint conference:
http://www.video.ja.net/documents/services/video/vtas/intorbookingserv.pdf

D.4 Further technical information

D 4.1 Videoconferencing standards:
http://www.video.ja.net/documents/services/video/vcstandards.pdf

D 4.2 H.323 border traversal:
http://www.ja.net/documents/services/video/vtas/h323_border_traversal.pdf

D.5 Video streaming

D 5.1 Streaming software guide:
http://www.video.ja.net/services/video/vtas/factsheetsuserguides/streamingsoftwareguides.html

D 5.2 Streaming product survey:
http://lisweb.swan.ac.uk/projects/vtas/

JANET(UK) manages the operation and development of JANET, the United Kingdom's education and research network, on behalf of the combined UK Higher and Further Education Funding Councils represented by JISC (Joint Information Systems Committee).

For further information please contact:

JANET Service Desk
JANET(UK)
Lumen House, Harwell Science and Innovation Campus
Didcot, Oxfordshire, OX11 0QS
Tel: 0300 300 2212
Fax: 0300 300 2213
E-mail: service@ja.net

Copyright: This document is copyright The JNT Association trading as JANET(UK). Parts of it, as appropriate, may be freely copied and incorporated unaltered into another document unless produced for commercial gain, subject to the source being appropriately acknowledged and the copyright preserved. The reproduction of logos without permission is expressly forbidden. Permission should be sought from the JANET Service Desk.

Trademarks: JANET is a registered trademark of the Higher Education Funding Councils for England, Scotland and Wales. The JNT Association is the registered user of this trademark. JANET(UK) is a trademark of The JNT Association. All (other) company or brand names may be trademarks of the respective companies with which they are associated.

Disclaimer: The information contained herein is believed to be correct at the time of issue, but no liability can be accepted for any inaccuracies. The reader is reminded that changes may have taken place since issue, particularly in rapidly changing areas such as internet addressing, and consequently URLs and e-mail addresses should be used with caution. The JNT Association cannot accept any responsibility for any loss or damage resulting from the use of the material contained herein.

Availability: Further copies of this document may be obtained from the JANET Service Desk at the above address.
