Draft – Originally published in: Grigoriadis, Y., Fickert, L., Ebner, M., Schön, M., Nagler, W. (2012) Podcasting for Electrical Power Systems, Conference Proceedings MIPRO 2012, IEEE, p. 1412-1417, ISBN 978-953-233-069-4
Podcasting for Electrical Power Systems
Y. Grigoriadis*, L. Fickert**, M. Ebner*, M. Schön*** and W. Nagler*
* Graz University of Technology/Department for Social Learning, Graz, Austria
** Graz University of Technology/Institute for Electrical Power Systems, Graz, Austria
*** Graz University of Technology/Lifelong Learning, Graz, Austria
email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org, email@example.com
Abstract - This paper reflects on six years of managing (lecture) recording activities at Graz University of Technology (TU Graz), with a special focus on the broad podcasting experience of the Institute of Electrical Power Systems of TU Graz (IFEA) and the advantages the institute has gained from using this service. Furthermore, the paper analyses the history, development, management, and growth of the service, aspects of evaluation and didactics, and future trends, in view of the challenges of a university-wide automated recording system. The paper also presents the latest development: an integrated search functionality offered for every single recording serviced by the Department of Social Learning (DSL) of TU Graz. This is made possible by applying Optical Character Recognition (OCR) to indexed video frames. Podcasting has become an integrated part of teaching activities at IFEA and at TU Graz in general. It will be further expanded into an automated system providing high-quality multimedia lecture recordings.
I. INTRODUCTION
Podcasts have revolutionized the modern music market, at the latest since Apple conquered it by introducing its iPod and iTunes platform and since Google's YouTube turned video publishing into mere child's play. Although a podcast is by definition an audio file distributed automatically to the user via RSS (Really Simple Syndication) technology, today the term is widely used for nearly any audio or video file offered or broadcast over the Internet. In this paper the term podcast therefore means any kind of recording offered to students, regardless of whether it can be obtained via RSS or not. It would be naive to think that podcasts will stop at the doors of schools or universities. It has become common for students to record lectures privately, which they are often not allowed to do. For that reason alone, it has become evident that an educational institution should offer recording services. At Graz University of Technology these services started in autumn 2006 as a centralized task for the entire university, in the course of the formation of the Department of Social Learning as a part of the Information Technology Services (ITS). One of the first adopters of the services was the Institute of Electrical Power Systems of TU Graz. From the very first day, IFEA was a professional partner in the cooperation with DSL. Since 2006, over 400 hours of IFEA lecture recordings have been offered to the students.
Professor Lothar Fickert, head of IFEA, showed immediate interest in the possibilities of the medium and in the way students could benefit from it. One reason for having his lectures recorded was a personal interest in effectively integrating new technologies into teaching scenarios. A second motive for using the service was that the head of institute, in his function as dean of studies, was increasingly confronted with the fact that many students simply could not be present during lecture hours because they had to work to earn their living. In order not to exclude such persons from university education, the head of institute aimed to make learning content more easily accessible to those students by offering podcasts. This approach is not to be confused with distance learning scenarios that lack phases of physical presence at the university. Clearly, the documentation and distribution of lectures with the assistance of media such as audio, video, on-screen interactive systems, or similar gives a distinct advantage to students of any curriculum. In this paper the authors' goal is to present the current state of the recording services and the developments regarding the indexing of video podcasts at TU Graz, pointing out factors that could be particularly beneficial for students of Electrical Engineering courses.
II. PODCASTING SERVICES AT TU GRAZ
The podcasting services provided by DSL for the entire TU Graz started in autumn 2006 with a couple of lectures being recorded on a voluntary basis; IFEA was one of the first institutes to cooperate. During this initial phase, evaluations among lecturers and students helped us to find an appropriate workflow from recording to output formats. After a phase of trial and error, a simple standard was found that recommends at least a minimum of didactical pre-settings. Other research works corroborate this didactical need. The services were extended by the installation of a dedicated live streaming server in autumn 2008 and by the launch of TU Graz's own Apple iTunes U platform for publishing media files in autumn 2010. A project to enable fully automated recording of lectures has been running since autumn 2010 and will be finished in autumn 2012. One of the first big steps was reached by enhancing the post-recording procedure with a method that enables a text-based search of the recorded video. Further development will integrate this effort into the overall process of automated recording.
The need for automated recordings is constantly increasing because all recordings are serviced manually by DSL, at least as far as post-processing tasks are concerned. Very often the recording itself also had to be carried out by members of DSL. This limitation in manpower and equipment can be addressed by automation. Fig. 1 gives an idea of the constantly growing total number of recordings and recording time since the beginning in 2006. Every 8th recording was made for IFEA; thus IFEA is still the heaviest user of this service. Furthermore, polls and several evaluations (n=430) were carried out, reflecting the high approval of the services in general. Nevertheless, the services constantly needed to be improved, mainly with regard to the quality of the output, the effectiveness of the workflow, and general enhancements.
Figure 1. Total number of recordings and recorded time
Fig. 2 and Fig. 3 sum up the results of the evaluations done between 2006 and 2008. A quarter of all students asked went through all of the recordings, but 20% did not use any of them. More than half of the students stated that the quality is good. The most interesting fact with respect to didactical aspects is the most frequently selected reason for not using recordings for learning purposes: 50% of the non-users (10% of all polled students) said that the recordings have no relevance for the examination. On the other hand, 37% of the students who watch the recordings use them to repeat lectures or specific parts of lectures. This shows that podcasts are indeed a welcome extra, but without didactical settings that take the new medium into account the effective benefit may not be seen immediately. During the ongoing automation project, further evaluations are being carried out to investigate how the didactical challenges can be better addressed. Additional oral evaluations were carried out personally by the head of IFEA, Prof. Fickert, to find out the students' personal opinions about podcasting. They took place after the combined written and oral tests. A surprisingly high number of students reported having listened to the podcasts in addition to studying the scripts and lecture notes. Those who had learned in this way were also able to answer specific questions more easily during the later examinations than students who had learned only on the basis of a given script and their lecture notes.
III. INDEXING VIDEO PODCASTS
Figure 2. Results from the evaluations of 2006 to 2008 
Figure 3. Results from the evaluations of 2006 to 2008
A. Why is there a Need for Indexing Podcasts?
Podcasts bring several advantages for learning, such as the repeatability of process-oriented content or the fact that a student who missed a lecture can catch up on it later. Especially for content with a lot of calculations and drawings or, more generally speaking, content that develops during the lecture, recordings are very useful for later understanding. This is particularly relevant for Electrical Engineering subjects with many equations, calculations, and drafts. One of the most unpleasant disadvantages of lecture recordings is their average duration, which makes them hard to use when a student is searching for a specific topic, term, or part of the lecture. She/he would have to browse more or less randomly through the podcast, spending valuable time, and would need to remember in which part of the lecture the content was taught. A search functionality based on speech and text analysis within the podcast can help a lot. Thus, the podcasts first have to be indexed to ensure effortless and accurate access to the content in demand. There are different ways of indexing a video podcast: thumbnail generation, audio transcription, text extraction from video frames, or even user tagging. Most of these are very time consuming and cannot be carried out fully automatically. Since it is the project's aim to have a completely automatic recording system installed, the authors have been working on a custom method for extracting text from a screen-capture video, as well as on thumbnail generation, as the first step in the process of making podcasts searchable.
B. Overview of the Methods
The general recording service provides a recording of the lecturer's spoken word using wireless microphones (Sennheiser EW 100 3G) and a screen capture of the lecturer's laptop, which by default matches the projector's output seen by the students. The lecturer's presentation is captured by screen-capturing software. Very often, but not exclusively, the presentation is a sequence of prepared slides (Microsoft PowerPoint, Apple Keynote, or comparable software). Lectures held with a tablet PC allow content to emerge during the lecture by writing and drawing on the tablet's surface. Such handwritten screen captures are hard to index for the later search functionality; nevertheless, they should be given a try too. The goal of the procedure described in the following is to extract the useful information (text) directly from the video rather than from the presented file. At IFEA, customized presentation software named "IFEA Viewer" was developed and is in general use. The pre-chosen fonts of this program ensure flawless text extraction, and the design of the viewer's main layout allows picture-in-picture (PiP) integration, which means that a camera video stream of the presenter/lecturer can be embedded in the main screen-capture video. The key technologies for text extraction and indexing are Optical Character Recognition (OCR) and the capabilities of Advanced Video Coding (AVC).
OCR refers to algorithms that extract text in electronic form from image documents (scanned or digitally generated) in formats such as TIFF and JPEG, with the purpose of making the text editable, searchable, and more compact to store, to name just a few benefits. AVC (also known as H.264) is the codec used for the production of the final video podcast within an MP4 container. During encoding, the AVC encoder delivers information that is essential for the indexing procedure, namely the type of each frame and its temporal position in the video file (timestamp). It is thus sensible to integrate the indexing procedure into the main editing/encoding routine in a manner that facilitates the flow of information between the active modules. In this case the editing/encoding routine consists of a Perl script that automatically carries out most of the tasks of podcast production, coordinating the tools used and adding custom algorithms for text filtering and frame sorting. The overall flow of the podcast production can be reduced to the scheme shown in Fig. 4. The detailed routine is shown in Fig. 5 and is described in the following section.
Figure 4. The overall editing/encoding routine
Figure 5. The editing/encoding routine in details
The term TeachCenter in Fig. 5 needs a short explanation. The TeachCenter has been the Learning Management System (LMS) of TU Graz since 2006. It is based on the platform WBTMaster, which has been developed and used by the team of Prof. Nikolai Scerbakov at the Institute for Information Systems and Computer Media of TU Graz (IICM) since the late 1990s.
The input file is produced with the screen-capturing software Camtasia Studio by TechSmith (for Microsoft Windows) or iShowU (for Apple Mac), which deliver high-quality videos encoded with (almost) lossless codecs. The audio signal is an uncompressed Pulse-Code Modulation (PCM) signal with 16 bit, 44100 Hz, mono. The final video product is embedded in an HTML page using Adobe Flash Player for controlling the playback, featuring a Table of Contents (ToC) as well as the search functionality for tracking specific terms within the video. The final video itself is an AVC-encoded video with Advanced Audio Coding (AAC) audio in an MP4 container. Typical settings for the video file are 5-10 fps (frames per second), a CQ (Constant Quality) setting of 32, and 40 kbps audio, resulting in podcasts with a good quality-to-size ratio. For the video/audio encoding, as well as for the frame extraction in the indexing procedure, FFmpeg is used. FFmpeg is versatile open-source audio/video manipulation software that includes libavcodec, the leading audio/video codec library used by popular multimedia players and converters such as MPlayer and VLC media player.
C. Applying OCR on Video Files
As described before, OCR refers to methods for extracting text from image files. OCR programs come at different levels of complexity: there are, for example, plain text extractors with limited features (usually included with a new scanner), simple online (server-based) OCR tools, but also more elaborate document converters that can extract text in multiple scripts and languages, recognize graphics, and even fully reproduce the layout of a document. Unfortunately, the authors' search for open-source or commercial software that can extract text directly from video files was not successful. Therefore it was essential to come up with a custom way of applying OCR to the screen-capture input files described above. The obvious solution was to extract frames from a video file in a format compatible with OCR software (e.g. JPG, PNG, TIFF) and then apply OCR to these image files. Following this path, several research questions arose:
• Which software to use?
• Which frames to extract?
• How can time information (essential for indexing) be obtained?
Regarding the second question, which frames to extract, it must first be noted that a video with a duration of 90 minutes and a frame rate of 5 fps contains 27,000 frames. Extracting all of them and applying OCR to each would be impractical. Fortunately, not all 27,000 frames contain useful information. Only a small fraction of the frames is actually needed, since most of them are identical or differ only insignificantly. More precisely, the frames that need to be extracted are those whose content changes significantly compared with the preceding frame (e.g. two different presentation slides). An intelligent approach to finding them relies on the way modern MPEG-4 video codecs operate. Video files mainly consist of a series of images; when these images are displayed in sequence at a constant rate, the effect of motion is achieved. Because of the large number of images that a video file usually contains, its size can become extremely large. To make video files smaller, video codecs emerged whose algorithms reduce the amount of information without severely degrading the quality.
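The frame count quoted above follows directly from the duration and the frame rate. As a minimal sketch in Python (the OCR cost of one second per frame is a purely hypothetical figure, not taken from the paper, and serves only to illustrate the order of magnitude):

```python
def total_frames(duration_min: float, fps: float) -> int:
    """Number of frames in a video of the given duration and frame rate."""
    return round(duration_min * 60 * fps)


# Values from the text: a 90-minute lecture recorded at 5 fps.
frames = total_frames(90, 5)

# Hypothetical assumption: running OCR on one frame takes about 1 second.
ASSUMED_OCR_SECONDS_PER_FRAME = 1.0
ocr_hours = frames * ASSUMED_OCR_SECONDS_PER_FRAME / 3600

print(frames)     # 27000
print(ocr_hours)  # 7.5
```

Under this assumed cost, processing every frame of a single lecture would take several hours, which is why only frames with changed content are of interest.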
These codecs take advantage of the fact that a great amount of information in neighboring frames is identical or very similar; the general idea of such algorithms is to discard the repetitive information while keeping the essential content. In the family of MPEG-4 codecs the Group of Pictures (GOP) model was introduced. These codecs produce video files with different types of frames: I-frames, which keep their full original information intact, and P- and B-frames, which contain only a part of their original content and depend on their preceding and possibly following I-frames for correct reproduction. Fig. 6 illustrates the principle and Fig. 7 gives a detailed example. While encoding a video file, these codecs compare the content of consecutive frames to detect significant alterations and, in that manner, decide which frames should become I-frames. It is this feature that is exploited in order to extract only the frames that are useful for OCR and indexing.
Figure 6. A Group of Pictures (GOP)
Figure 7. A frame sequence record
Regarding the first question, which software to use, there were distinct requirements to be fulfilled. The two main requirements were a command-line version that can be used in scripts and high performance, because extracting one frame per second of a video file and then filtering out frames with identical content consumes a great deal of time and processing power. The brainstorming and research that followed resulted in the idea of exploiting features of the FFmpeg tool and of the AVC to obtain the information essential for the indexing procedure. FFmpeg is able to extract frames from a video file in many different formats. It is a command-line tool, so it can be used in a script. In addition, it is the tool used for encoding the final video. Several tasks are therefore carried out with one program. Here are two examples of its usage for our tasks:
• Encoding a video file:
$ ffmpeg -i <inputfile> -ac 1 -ab 40k -vcodec libx264 -fpre <codec_preset> -crf 23 -vstats_file <statsfile> <outputfile>
with:
-i: name of the input video file
-ac: number of audio channels
-ab: audio bitrate
-vcodec: video codec library
-fpre: codec preset file
-crf: constant rate factor
-vstats_file: name of the text file to which the encoding statistics are written
• Extracting a specific frame from a video file:
$ ffmpeg -ss <offset> -i <inputfile> -an -vframes 1 -qscale 1 <outputfile>
with:
-ss: offset (time of the frame to be extracted) in seconds
-an: no audio
-vframes: number of consecutive frames to extract
-qscale: quality factor (1 [best] to 31 [worst])
Make sure that the name of the output file ends with ".jpg".
As already mentioned, FFmpeg features an option (-vstats_file) that dumps encoding information (such as frame types, sizes, and timestamps) into a text file while a video file is being encoded. After the encoding is completed, this text file contains all the information needed for extracting the frames that are useful for OCR and indexing. Fig. 7 is an example of how this information is recorded; compare the types of the frames listed. Since a maximum distance between two I-frames has to be set, consecutive I-frames may be identical: when this distance is reached, a new I-frame is generated regardless of its content. It is still possible, however, to compare the sizes of the I-frames to determine whether they actually differ from each other. This comparison is made in Perl and is one of the custom algorithms used in the indexing procedure. Another one is the search for I-frames that are too close to one another, found by comparing their timestamps; if, for example, a lecturer searches for a specific slide and goes through several slides at a relatively fast pace, only the last I-frame will be extracted, since the others are not useful for indexing. Both checks are applied to the frame sizes and timestamps recorded in the statistics file.
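The per-frame statistics discussed above can be read with a few lines of code. The following is a minimal Python sketch, not the authors' Perl routine. It assumes that each line of the statistics file consists of `key= value` pairs including `frame`, `f_size` (bytes), `time` (seconds), and `type`, as written by common FFmpeg builds; the sample lines are invented for illustration, and the names `FrameStat` and `parse_vstats` are hypothetical.

```python
import re
from typing import List, NamedTuple


class FrameStat(NamedTuple):
    number: int      # frame number as reported by the encoder
    size_bytes: int  # encoded size of the frame in bytes
    time_ms: int     # timestamp in milliseconds
    frame_type: str  # "I", "P" or "B"


# Matches "key= value" pairs; the layout itself is an assumption.
_FIELD = re.compile(r"(\w+)=\s*(\S+)")
_REQUIRED = {"frame", "f_size", "time", "type"}


def parse_vstats(text: str) -> List[FrameStat]:
    """Parse statistics lines, skipping lines without the required keys."""
    stats = []
    for line in text.splitlines():
        fields = dict(_FIELD.findall(line))
        if not _REQUIRED <= fields.keys():
            continue
        stats.append(FrameStat(
            number=int(fields["frame"]),
            size_bytes=int(fields["f_size"]),
            time_ms=round(float(fields["time"]) * 1000),
            frame_type=fields["type"],
        ))
    return stats


# Invented sample in the assumed layout.
SAMPLE = """\
frame=     1 q= 23.0 f_size=  41237 s_size=       40kB time= 0.200 br= 1649.5kbits/s avg_br= 1649.5kbits/s type= I
frame=     2 q= 25.0 f_size=    312 s_size=       41kB time= 0.400 br=   12.5kbits/s avg_br=  831.0kbits/s type= P
frame=     3 q= 25.0 f_size=    120 s_size=       41kB time= 0.600 br=    4.8kbits/s avg_br=  555.6kbits/s type= B
frame=     4 q= 23.0 f_size=  39870 s_size=       80kB time= 0.800 br= 1594.8kbits/s avg_br=  815.4kbits/s type= I
"""

i_frames = [s for s in parse_vstats(SAMPLE) if s.frame_type == "I"]
for s in i_frames:
    print(s.number, s.size_bytes, s.time_ms)
```

Only the I-frames are kept here because, as described above, they mark the points where the encoder detected a significant change in content.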
The frames can be thought of as a sequence:
..., f[n-1], f[n], f[n+1], ...
IF |fs[n-1] - fs[n]| < S OR ft[n] - ft[n-1] < T THEN discard the current frame f[n]
with:
n: number of the frame
fs: size in bytes
ft: time in ms
S: deviation parameter for the size
T: deviation parameter for the time
Finally, regarding the third question, "How can time information (essential for indexing) be obtained?", the answer has already been given: the -vstats_file option of FFmpeg delivers information about the encoded frames, including their timestamps. During frame extraction, the timestamp of each extracted frame is saved in the name of the image file created, for example "120126_432002_fickert_0324400.jpg", where "0324400" is the timestamp in ms.
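The discard rule and the file naming can be sketched in Python as follows. This is an illustrative reading, not the authors' Perl code. It applies the rule literally, comparing each I-frame with its immediate predecessor in the sequence and always keeping the first frame; keeping the last frame of a rapid sequence, as the prose describes, would instead require a comparison with the following frame. The threshold values and the function names are hypothetical, and the seven-digit zero padding is inferred from the single file name given in the text.

```python
from typing import List, Tuple

Frame = Tuple[int, int]  # (size in bytes, time in ms) of one I-frame


def filter_iframes(frames: List[Frame], s: int, t: int) -> List[Frame]:
    """Discard f[n] if |fs[n-1] - fs[n]| < s or ft[n] - ft[n-1] < t."""
    kept: List[Frame] = []
    for n, (size, time_ms) in enumerate(frames):
        if n > 0:
            prev_size, prev_time = frames[n - 1]
            if abs(prev_size - size) < s or time_ms - prev_time < t:
                continue  # too similar in size or too close in time
        kept.append((size, time_ms))
    return kept


def frame_filename(prefix: str, time_ms: int) -> str:
    """Build an image name that ends with the zero-padded timestamp in ms."""
    return f"{prefix}_{time_ms:07d}.jpg"


def timestamp_from_filename(name: str) -> int:
    """Recover the timestamp in ms from a name built by frame_filename."""
    stem = name.rsplit(".", 1)[0]
    return int(stem.rsplit("_", 1)[1])


# Invented I-frame sequence; S = 100 bytes and T = 5000 ms are assumptions.
iframes = [(41237, 0), (41240, 60000), (39870, 120000),
           (52310, 121000), (30115, 324400)]
kept = filter_iframes(iframes, s=100, t=5000)
print(kept)  # [(41237, 0), (39870, 120000), (30115, 324400)]

name = frame_filename("120126_432002_fickert", kept[-1][1])
print(name)                           # 120126_432002_fickert_0324400.jpg
print(timestamp_from_filename(name))  # 324400
```

Storing the timestamp in the file name means that the text recognized in an image can later be linked back to its position in the video without a separate lookup table.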
This paper outlines the continuous development and enhancement of the recording services at TU Graz towards becoming an integrated part of the didactical methods for teaching and learning, from their beginning to the ongoing project that aims to automate the entire workflow. One essential step forward has been achieved by adding search functionality to the output videos offered to the students. The procedure described above for making long video podcasts searchable is only one part of the larger project. Other approaches to indexing are to be researched and eventually integrated into the workflow, among them speech-to-text applications and the introduction of interactive elements such as user tagging. Speech-to-text is a rapidly advancing field that has lately shown promising results, whereas a few years ago it seemed unusable. The possibility of extracting text from the voice recording of the lecturer not only makes indexing much more complete but also allows captions or subtitles to be added to the video podcasts, making them useful for the hearing impaired. User tagging is a way of having the automatically generated index checked, enriched, and even corrected by the users themselves. This gives e-learning elements of a living organism that is constantly evolving, while enhancing the quality of the learning content without increasing the resources needed for its creation and maintenance. In practice, an advantage of podcasting lies in the possibility for working students to participate in the lecture activities without physical presence and, in some cases, to complete the lectures with good results even without visiting the classroom. Moreover, the personal oral evaluations indicate positive effects on the learning outcome.
It must be clearly pointed out, however, that it is not a goal to replace the attendance of students at the university by offering recorded lectures; TU Graz is not a distance university. Close contact and the exchange of substantial information between DSL and the various institutes and lecturers, as well as the students, are essential for achieving the goals described above. The cooperation between DSL and IFEA can be regarded as an excellent example of such collaboration.
REFERENCES
C. Dale, "Strategies for Using Podcasting to Support Student Learning," Journal of Hospitality, Leisure, Sport and Tourism Education, vol. 6(1), 2007, pp. 49-57.
D. Helic, H. Maurer and N. Scerbakov, "Knowledge Transfer Processes in a Modern WBT System," Journal of Network and Computer Applications, vol. 27(3), 2004, pp. 163-190.
G. Campbell, "There's Something in the Air - Podcasting in Education," EDUCAUSE, vol. 40(6), 2005, pp. 32-47.
H. Maurer and N. Scerbakov, "Multimedia Authoring for Presentation and Education: The Official Guide to HM-Card," Bonn: Addison-Wesley, February 1996, pp. 250.
L. Fickert, E. Schmautzer, W. Nagler, I. Kamrat and C. Stojke, "Experiences and Adaptation of Teaching Concepts in the Field of Multimedia Learning for Electrical Power Systems at the University of Technology Graz," Region 8 Eurocon 2004, The International Conference on Computer as a Tool, Ljubljana, Slovenia, 2004.
M. Blaisdell, "Academic MP3s: Is it time yet?", Educause, 2006.
M. Ebner, N. Scerbakov, C. Stickel and H. Maurer, "Mobile Information Access in Higher Education," E-Learn 2008, Las Vegas, pp. 777-782.
M. Ebner, W. Nagler and A. Saranti, "TU Graz goes Podcast," Micromedia and Corporate Learning - Proceedings of the 3rd International Microlearning 2007 Conference, Innsbruck, Austria, pp. 221-233.
N. Townend, "Podcasting in Higher Education," Media Onlinefocus 22, British Universities Film & Video Council, 2005.
P. Edirisingha and G. Salmon, "Pedagogical models for podcasts in higher education," EDEN Annual Conference 2007, Naples, Italy.
T. Dietinger and H. Maurer, "GENTLE – General Network Training and Learning Environment," ED MEDIA 1998 / EDTelecom 1998, Freiburg, Germany, pp. 274-280.
T. Y. Huann and M. K. Thong, "Audioblogging and Podcasting in Education," 2006.
W. Nagler, A. Saranti and M. Ebner, "Podcasting at TU Graz: How to Implement Podcasting as a Didactical Method for Teaching and Learning Purposes at a University of Technology," ED-MEDIA - World Conference on Educational Multimedia, Hypermedia & Telecommunications 2008, Vienna, Austria, pp. 3858-3863.
W. Nagler, Y. Grigoriadis, C. Stickel and M. Ebner, "Capture Your University," IADIS International Conference e-Learning 2010, pp. 139-144.