Organizers: Giuseppe Valenzise, Patrick Le Callet
Organizers: Nikolaos Stefanakis, Athanasios Mouchtaris, Paris Smaragdis
Organizers: Francesca De Simone, Pascal Frossard, Neil Birkbeck, Balu Adsumilli
Organizers: Saverio Blasi, Farzad Toutounchi, Vasileios Mezaris, Marta Mrak
In the past ten years, tools for generating, distributing and rendering immersive video have firmly entered the research agenda of the multimedia community. This has led to new video formats that provide higher interactivity and an increased sense of realism. Nowadays, immersive video is starting to reach the public at large, thanks in part to significant industrial investment and ongoing activities in the video standardization bodies. A considerable challenge, therefore, is measuring the quality of experience that these emerging video formats offer to users. This special session aims to gather and discuss some of the most recent and significant results in both subjective and objective quality assessment of immersive-enabling video representations. These include, but are not limited to, point clouds and meshes, light fields, super-multiview and integral images, as well as ultra-high definition video.
We invite submissions in the area of User Generated Content (UGC), encouraging innovative research in the automatic organization of UGC, with a focus on methodologies to automatically discover, match, group and synchronize overlapping audio (or video) files of the same event. Additionally, we invite complete or visionary papers on the collaborative processing of UG audio recordings, where, given a collection of overlapping UG audio streams, the task is to combine the available content with the goal of improving the overall acoustic representation of the captured event. Possible topics along these directions include automatic UGC organization, multi-microphone sound scene analysis, mixing of geographically spread UG audio recordings, multichannel sound reproduction based on UGC, sound source/scene enhancement and source/scene separation, foreground suppression and privacy preservation, and combination of UGC with professional content. To support contributions on these topics, we provide at doi.org/10.5281/zenodo.164175 two open-access audio datasets with user-generated audio recordings in PCM format. In addition to the above UGC-related topics, submissions on established multi-microphone signal processing areas are also encouraged, including but not limited to ad-hoc microphones and microphone arrays, Wireless Acoustic Sensor Networks (WASNs), and multi-microphone processing for spatial audio and immersive sound reproduction.
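As an illustration of the synchronization task mentioned above, a common baseline is to estimate the time offset between two overlapping recordings from the peak of their cross-correlation. The sketch below uses a synthetic seeded noise burst rather than the provided datasets, and the function name is ours, not from any toolkit:

```python
import numpy as np

def estimate_offset(ref, other, sr):
    """Estimate how many seconds `other` lags behind `ref` from the
    peak of the full cross-correlation of the two waveforms."""
    corr = np.correlate(other, ref, mode="full")
    lag = np.argmax(corr) - (len(ref) - 1)  # peak index -> sample lag
    return lag / sr

# Synthetic check: the same (seeded) noise burst captured twice,
# the second copy starting 0.5 s later.
sr = 1000
burst = np.random.default_rng(0).standard_normal(2 * sr)
ref = np.concatenate([burst, np.zeros(sr)])
other = np.concatenate([np.zeros(sr // 2), burst, np.zeros(sr // 2)])
print(estimate_offset(ref, other, sr))  # 0.5
```

Real UG recordings would of course need resampling to a common rate and more robust features (e.g. onset envelopes) before correlation, but the peak-lag principle is the same.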
Fully omnidirectional cameras, able to instantaneously capture the 360° surrounding scene, have recently started to appear as commercial products and professional tools. In addition, innovative applications exploiting omnidirectional images and videos are expected to develop rapidly in domains such as virtual reality and robotics. While the popularity of 360° content and of applications using such content is increasing rapidly, many technical challenges at different steps of the omnidirectional signal acquisition, processing and distribution chain remain open: new strategies are needed to efficiently capture, represent, compress, process and distribute these new digital visual signals, exploiting knowledge of the signal acquisition process as well as of users' navigation patterns. This special session aims to bring together researchers and practitioners from academia and industry to present and discuss open challenges and proposed solutions addressing the different steps of the omnidirectional image and video processing chain, from signal acquisition through to rendering to the user. Topics of interest include spherical signal representations (such as sphere-to-plane projections, mesh-based representations, etc.), 360° cameras and stitching algorithms, rendering strategies for 360° content via head-mounted displays, streaming strategies for 360° videos, compression of 360° images and videos, analysis and models of viewport navigation patterns for 360° content, spherical signal processing (sampling, filtering and transforms in the spherical domain), standardization efforts for 360° signal representation, streaming and compression, and quality assessment methods for 360° images and videos.
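As a concrete example of one of the topics above, the simplest sphere-to-plane mapping is the equirectangular projection, which maps longitude and latitude linearly to pixel coordinates. A minimal sketch (function names are ours, for illustration only):

```python
import math

def sphere_to_equirect(lon, lat, width, height):
    """Map spherical coordinates (longitude in [-pi, pi], latitude in
    [-pi/2, pi/2]) to equirectangular pixel coordinates (x, y)."""
    x = (lon / (2 * math.pi) + 0.5) * width
    y = (0.5 - lat / math.pi) * height
    return x, y

def equirect_to_sphere(x, y, width, height):
    """Inverse mapping: pixel coordinates back to (lon, lat)."""
    lon = (x / width - 0.5) * 2 * math.pi
    lat = (0.5 - y / height) * math.pi
    return lon, lat

# The centre of a 2048x1024 equirectangular image corresponds to
# the point (lon, lat) = (0, 0) on the sphere:
print(equirect_to_sphere(1024, 512, 2048, 1024))  # (0.0, 0.0)
```

Note that this mapping heavily oversamples the sphere near the poles, which is precisely what motivates the alternative representations (meshes, other projections) and the spherical-domain processing tools listed above.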
With the wide availability of consumer products that provide high-quality visual experiences at home, as well as content capture from mobile devices, a wealth of fascinating new applications and services can be unleashed. Such products enable new forms of media, such as User Generated Content (UGC), to reach not only social media services but also broadcast programmes, which are typically characterised by high content quality.
This challenging media landscape is underpinned by numerous video processing technologies that support efficient media production and delivery. The aim of this special session is to present some of the most recent findings from the research community and to tackle the exciting new challenges of enabling this next generation of multimedia services. Topics that will be covered include, but are not limited to: encoding and delivery of UGC and legacy content (fast and efficient video compression of UGC or legacy material, adaptive streaming at low-to-high bitrates, etc.); processing and enhancement of UGC and legacy content (upsampling, super-resolution, image denoising, dynamic range adaptation, etc.); and assessment and analysis of UGC and legacy content. This special session is co-organised by the H2020 European projects COGNITUS (grant agreement No 687605) and InVid (grant agreement No 687786), which focus on the usage and verification of new forms of content.
Authors are encouraged to validate their work using the Edinburgh Festival open dataset that will be made available in the context of the H2020 European project COGNITUS.