Mark Cracknell
Project Manager
This dataset was captured from the existing TfL CCTV system, which comprises over 1200 cameras. Each camera transmits full-colour video back to a central CCTV matrix, where it is distributed to over 500 users.
It is typical in academia to annotate video on a per-frame basis; that is, every frame within the video is annotated, describing all vehicles in the scene and their positions. This serves a useful purpose when testing a newly formed algorithm. However, TfL is only interested in testing the application layer of image detection systems. The ground truth associated with the TfL dataset is therefore event-level. That is, each video clip is annotated with a specific scenario in mind, for example "No right turn". In this case the ground truth consists of a number of timestamped events, detailing each vehicle turning right at a particular junction. Whilst this does not give any significant information about the video clip, it is of a sufficient level for TfL's purposes. For an end-user, the primary concerns with an image detection system are False Events and Missed Events. The number of times the system correctly or incorrectly alarms is more important than whether or not a car was seen 1 frame after it appears. Once the video is collected and the ground truth collated, the dataset is ready for use.

Smart Camera evaluation
There are a number of different ways to make use of image detection technology. TfL primarily uses existing infrastructure. The advantage is that, as the infrastructure is already in place, costs are reduced. The design of TfL's CCTV network means that all the CCTV feeds are brought directly back to a central location. This means that in a single location there is a large number of analogue video feeds available to process. In this case a centralised processing approach to image detection makes sense. However, there are circumstances where this is not ideal or even possible. London is a very large city and, unlike many North American cities, its street layout means that there are often many junctions where there is no existing CCTV coverage.
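Returning to the event-level ground truth described earlier, the counting of False Events and Missed Events can be sketched as below. This is only an illustration under stated assumptions, not TfL's evaluation tool: the `Event` type, the `score_events` function, the greedy nearest-event matching and the 2-second tolerance are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A timestamped event, e.g. one vehicle making a banned right turn."""
    time_s: float  # seconds from the start of the clip (assumed unit)


def score_events(ground_truth, detections, tolerance_s=2.0):
    """Greedily match each detection to the nearest unmatched ground-truth event.

    tolerance_s is an assumed matching window, not a figure from the paper.
    Returns (correct_events, false_events, missed_events).
    """
    unmatched = list(ground_truth)
    correct = 0
    false_events = 0
    for det in sorted(detections, key=lambda e: e.time_s):
        # Closest ground-truth event that has not been claimed yet.
        best = min(unmatched,
                   key=lambda gt: abs(gt.time_s - det.time_s),
                   default=None)
        if best is not None and abs(best.time_s - det.time_s) <= tolerance_s:
            unmatched.remove(best)
            correct += 1
        else:
            # Alarm with no corresponding annotated event.
            false_events += 1
    # Any ground-truth event left over was never detected.
    return correct, false_events, len(unmatched)


truth = [Event(12.0), Event(47.5), Event(90.0)]
alarms = [Event(12.8), Event(30.0), Event(91.5)]
print(score_events(truth, alarms))  # (2, 1, 1)
```

Here two alarms fall within the tolerance of an annotated event, one alarm has no matching event (a False Event), and one annotated event is never detected (a Missed Event).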
In some cases there are key strategic sites which do not have CCTV coverage. In these cases it would be beneficial to have a monitoring capability.
Transport for London DTO - T&S D&R
To ensure that all required information was captured in the datasets, a confusion matrix was drawn up similar to the one below:

Scenario      Congestion   Congestion   Turns      Counts
Weather       Sun          Rain         Fog        Wet
Road Type     Single       Dual         Single     Dual
Light Level   Bright Sun   Dark         Overcast   Dark
The process of compiling a video dataset is twofold: firstly, to capture the required video segments and edit them to a suitable length; secondly, to ground-truth the video. Ground-truthing describes the process of manually reviewing a video clip to annotate what is happening within the clip. This annotation forms the baseline data against which any tests are referenced. There are a number of different levels of ground truth, dependent on the detail of testing to be carried out.
Project IRID - Phase 2
However, the Traffic controllers already have too many cameras to watch, so increasing this burden may not prove the best solution. To address this, TfL is currently investigating smart camera technology. This approach is in contrast to the centralised processing approach: the intelligence is moved out to the street. Where there is a requirement to install new cameras at unmonitored junctions, there is a real benefit in including image detection hardware as well. Smart camera technology can take a number of forms, including: all-in-one camera solutions, with processing built into the camera housing; edge devices, which sit in a roadside cabinet; or codec-based architecture, which utilises unused processing power in the digital codecs to perform simple image detection. A trial of 3 different technology approaches (as highlighted above) was carried out to determine what capabilities are available and whether there is a loss or increase of functionality and performance when compared to server-based processing.

Congestion Monitoring system deployment
The key work stream in this project was the deployment of a 20 camera congestion detection system. TfL successfully proved in 2007 that there is value in using image-based detection for congestion detection. Following on from that, TfL has deployed a 20 camera system for this purpose. Initial roll-out concentrated on 20 key sites selected by the relevant stakeholders. The stakeholders are primarily the Traffic controllers, who use this system day-in, day-out. The 20 sites highlighted by the Traffic controllers are strategic sites which, if congested, will cause severe problems elsewhere on the road network. These sites act as early warning signs for congestion problems. Each monitored camera is configured individually, as every site requires its own definition of congestion. In some cases stationary traffic for 20s may be normal, but for others any stationary traffic is unusual. These cameras are not used exclusively for image detection but are available for use by a large number of operators. This presents a problem, as all of TfL's cameras are Pan-Tilt-Zoom (PTZ). The deployed system had to be designed to ensure that, if a camera was moved from its configured or home position, it would suspend processing until it was returned home. Once the system has detected congestion it must alert the Traffic controllers so that they can take appropriate action. Multi-sensory alarms, including audio and visual, are delivered to ensure that alerts are received and recognised by the users.
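The per-site definition of congestion and the home-position rule described above might be combined along the following lines. This is a hedged sketch only: the `CameraConfig` and `Observation` types, the `evaluate` function, the field names and the camera identifiers are hypothetical, and the deployed system's actual logic is not described in this paper.

```python
from dataclasses import dataclass


@dataclass
class CameraConfig:
    camera_id: str
    # Longest stationary period regarded as normal at this site (assumed field).
    stationary_limit_s: float


@dataclass
class Observation:
    camera_id: str
    at_home_position: bool   # False once an operator has moved the PTZ camera
    stationary_for_s: float  # how long traffic has been stationary in view


def evaluate(config, obs):
    """Return 'suspended', 'congested' or 'normal' for a single observation."""
    if not obs.at_home_position:
        # The configured detection zones no longer match the view,
        # so processing is suspended until the camera returns home.
        return "suspended"
    if obs.stationary_for_s > config.stationary_limit_s:
        return "congested"
    return "normal"


# Site A tolerates 20 s of stationary traffic; at site B any stationary
# traffic is unusual. Both identifiers are made up for the example.
site_a = CameraConfig("cam-A", stationary_limit_s=20.0)
site_b = CameraConfig("cam-B", stationary_limit_s=0.0)

print(evaluate(site_a, Observation("cam-A", True, 15.0)))   # normal
print(evaluate(site_b, Observation("cam-B", True, 15.0)))   # congested
print(evaluate(site_b, Observation("cam-B", False, 15.0)))  # suspended
```

Checking the home position before the congestion threshold reflects the requirement that a moved camera must never raise an alert, whatever it happens to be looking at.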
A three-tier architecture was implemented in order to preserve current network integrity.
Tier 1: Video servers, linking directly into the CCTV matrix. These perform the video analysis.
Tier 2: Alert server, a remote machine placed elsewhere on the network. This machine aggregates data from each Video server and delivers this data to the user. The Alert server uses a web service as the delivery method for user alerts.
Tier 3: User desktop. Users connect to the Alert server via a standard web browser.
The Video servers are 19-inch rack-mounted servers optimised for minimal power consumption and heat dissipation. The Alert server is a virtual machine available on the TfL network running a web service. Background data traffic flows from the Video servers to the Alert server, delivering alert data. Alerts are delivered to standard PC clients running any web browser.
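The aggregation role of the Alert server in the three-tier design can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `Alert` fields, the `AlertServer` class, the JSON payload and the server and camera names are hypothetical, and the paper does not specify the web service's interface or data format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Alert:
    video_server: str  # Tier 1 machine that raised the alert
    camera_id: str
    timestamp: str     # ISO 8601 string, so text order equals time order
    message: str


class AlertServer:
    """Tier 2: collects alerts from every Video server for browser clients."""

    def __init__(self):
        self._alerts = []

    def receive(self, alert):
        # Called for each alert arriving from a Tier 1 Video server.
        self._alerts.append(alert)

    def alerts_as_json(self):
        # Payload a Tier 3 browser client would fetch, oldest alert first.
        ordered = sorted(self._alerts, key=lambda a: a.timestamp)
        return json.dumps([asdict(a) for a in ordered])


server = AlertServer()
server.receive(Alert("vs-02", "cam-B", "2008-03-01T08:15:00", "congestion"))
server.receive(Alert("vs-01", "cam-A", "2008-03-01T08:05:00", "congestion"))
print(server.alerts_as_json())
```

Keeping the user-facing delivery on a separate machine means browser clients never connect to the Video servers that sit directly on the CCTV matrix.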
The implementation of this system is in line with TfL's greater goal of providing 24/7 real-time traffic operations to give journey time reliability. Detecting congestion build-up more quickly and implementing relief strategies sooner will reduce the effect of congestion on road users. The reduction of congestion benefits not only journey times but also journey time reliability, vehicle emissions and fuel consumption. Any automated system runs the risk of being made redundant if it erroneously alerts the user too often. Steps were taken to minimise this risk, including maximising accuracy and delivering the alerts to the appropriate user.

Conclusion
As this project is nearing completion, the full results are not yet available. The project is due to have tabulated all the data by May 2008, when the following conclusions will be drawn and comprehensively discussed in the presentation:
Comments on the use of smart camera technology, including accuracy, reliability, functionality and scalability.
Comments on the successes and issues of installing a 20 camera congestion monitoring system, including user feedback and impact on the road network.