Audio-video mixing is an important element of cinematography. Most videos, such as motion pictures and sitcoms, have several segments without any speech. Adding carefully chosen music to such segments conveys emotions such as happiness, tension, or melancholy. In a typical professional video production, skilled audio-mixing artists aesthetically add appropriate audio to the given video shot. This process is tedious, time-consuming, and costly. The pivot vector space approach to audio-video mixing is a novel method that automatically selects the best audio clip (from the available database) to mix with the given video shot.
This technique uses a pivot vector space mixing framework to incorporate the aesthetic heuristics for mixing audio with video.
This technique eliminates the need for professional audio-mixing artists and is therefore inexpensive. It also saves time and is convenient. Significant advances are occurring constantly in the field of information technology, and IT-related fields such as multimedia are developing rapidly.
This is evident from the release of a range of multimedia products such as mobile handsets, portable MP3 players, digital video camcorders, and handicams. Products such as handicams and digital video camcorders have made activities such as producing home videos easy.
This was not the case a decade ago, when no such products were available in the market and video production was reserved entirely for professional video artists. Today a large number of home videos are being made, and the number of amateur and home video enthusiasts is very large.
A home video artist can never match the aesthetic capabilities of a professional audio-mixing artist. However, employing a professional mixing artist to produce a home video is not feasible, as it is costly, tedious, and time-consuming.
The pivot vector space approach is a novel technique of audio-video mixing that automatically selects the best audio clip from the available database to be mixed with the given video shot. Until the development of this technique, audio-video mixing could be done only by professional audio-mixing artists. However, employing these artists is very expensive and is not feasible for home video mixing. Besides, the process is time-consuming and tedious.

Fig(1): Pivot Vector Representation
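The selection step can be illustrated with a minimal sketch. It assumes that each video shot and each audio clip has already been mapped to a vector of aesthetic attributes in the shared pivot space, and it assumes cosine similarity as the matching score, since the measure is not specified here. The function names, the three-dimensional vectors, and the clip names are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def select_best_audio(video_vector, audio_database):
    """Return the name of the clip whose pivot vector scores highest against the video shot."""
    return max(
        audio_database,
        key=lambda name: cosine_similarity(video_vector, audio_database[name]),
    )

# Hypothetical pivot-space vectors (assumed, not taken from the text).
video_shot = [0.9, 0.2, 0.7]
audio_clips = {
    "calm_piano": [0.1, 0.8, 0.2],
    "fast_drums": [0.8, 0.1, 0.9],
    "soft_strings": [0.3, 0.6, 0.4],
}
best = select_best_audio(video_shot, audio_clips)
```

With these made-up vectors, `best` is `"fast_drums"`, the clip whose vector points in nearly the same direction as the video shot's vector.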
Movies comprise images (still or moving); graphic traces (texts and signs); recorded speech, music, and noises; and sound effects. The different roles of music in movies can be categorized into:
- Setting the scene (creating an atmosphere of time and place)
- Adding emotional meaning
- Serving as a background filler
- Creating continuity across shots or scenes
- Emphasizing climaxes (alerting the viewer to climaxes and emotional points of scenes)
The links between music and moving images are extremely important, and the juxtaposition of such elements must be carried out according to some aesthetic rules. The scientist Zettl explicitly defined such rules in the form of a table, presenting the features of moving images that match the features of music.
Zettl based these proposed mixing rules on the following aspects:
- Tonal matching (related to the emotional meaning defined by Copland)
- Structural matching (related to emotional meaning and emphasizing climaxes defined by Copland)
- Thematic matching (related to setting the scene as defined by Copland)
- Historical-geographical matching (related to setting the scene as defined by Copland)
In the following table, we summarize the work of Zettl by presenting aesthetic features that correspond in video and music. The table also indicates which features are extractable, because many video and audio features defined by Zettl are high-level perceptual features that cannot be extracted by the state of the art in computational media aesthetics.
The table shows, from the cinematic point of view, a set of attributed features (such as color and motion) required to describe videos. The computations for extracting aesthetic attributed features from low-level video features occur at the video shot granularity. Because some attributed features are based on still images (such as high light falloff), we compute them on the key frame of a video shot. We try to optimize the trade-off between accuracy and computational efficiency among the competing extraction methods. Also, even though we assume that the videos considered come in the MPEG format (widely used by several home video camcorders), the features exist independently of a particular representation format.
Light falloff refers to the brightness contrast between the light and shadow sides of an object and the rate of change from light to shadow. If the brightness contrast between the lighted side of an object and the attached shadow is high, the frame has fast falloff. This means the illuminated side is relatively bright and the attached shadow is quite dense and dark. If the contrast is low, the resulting falloff is considered slow. No falloff(or extremely low falloff) means that the object is lighted equally on all sides.
The color features extracted from a video shot consist of four features:

Saturation
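As an illustration only, saturation can be measured with the standard HSV definition, S = (max - min) / max over a pixel's RGB channels. The sketch below assumes 8-bit RGB pixels taken from a shot's key frame and averages their saturation. The function name, the averaging step, and the sample pixels are assumptions, not details given here.

```python
import colorsys

def mean_saturation(pixels):
    """Average HSV saturation over an iterable of (r, g, b) pixels with 8-bit channels."""
    total = 0.0
    count = 0
    for r, g, b in pixels:
        # colorsys expects channel values in the range [0, 1].
        _, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        total += s
        count += 1
    return total / count if count else 0.0

# Hypothetical key frame: pure red, mid gray, white, pure blue.
key_frame = [(255, 0, 0), (128, 128, 128), (255, 255, 255), (0, 0, 255)]
shot_saturation = mean_saturation(key_frame)
```

Here the two pure colors have saturation 1 and the gray and white pixels have saturation 0, so `shot_saturation` is 0.5.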
To measure the video segment’s motion intensity, we use a set of automatically extractable descriptors of motion activity, which are computed from the MPEG motion vectors and can capture the intensity of a video shot’s motion activity. Here we use the max2 descriptor, which discards 10 percent of the motion vectors to filter out spurious vectors or very small objects.
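A minimal sketch of a max2-style computation follows. It assumes that the discarded 10 percent are the vectors with the largest magnitudes and that the descriptor is the largest magnitude among the remaining vectors. Both points are assumptions, since only the discard fraction is stated, and the function name and sample data are hypothetical.

```python
import math

def max2_motion_intensity(motion_vectors, discard_fraction=0.10):
    """Largest motion-vector magnitude after dropping the top `discard_fraction` of vectors."""
    magnitudes = sorted(math.hypot(dx, dy) for dx, dy in motion_vectors)
    if not magnitudes:
        return 0.0
    # Assumption: the discarded vectors are the ones with the largest magnitudes.
    n_discard = int(len(magnitudes) * discard_fraction)
    kept = magnitudes[:len(magnitudes) - n_discard]
    return kept[-1]

# Nine consistent vectors of magnitude 5 plus one spurious outlier of magnitude 50.
vectors = [(3, 4)] * 9 + [(30, 40)]
intensity = max2_motion_intensity(vectors)
```

The outlier falls in the discarded 10 percent, so `intensity` is 5.0 rather than 50.0, which is the filtering effect described above.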
Music perception is an extremely complex psycho-acoustical phenomenon that is not well understood. So instead of directly extracting the music’s perceptual features, we can use the low-level signal features of audio clips, which can provide clues on how to estimate the numerous perceptual features.
We describe here the basic features that are extracted from an audio excerpt.

Spectral centroid
The spectral centroid is commonly associated with the measure of a sound’s brightness. We obtain this measure by evaluating the center of gravity using the frequency and magnitude information of Fourier transforms. The individual centroid C(n) of a spectral frame is the average frequency weighted by the amplitudes, divided by the sum of the amplitudes.
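Written out, the definition above is C(n) = sum over k of f(k) * |X_n(k)|, divided by the sum over k of |X_n(k)|, where f(k) is the frequency of bin k and |X_n(k)| is its magnitude in frame n. The sketch below follows that formula using a naive discrete Fourier transform so that it needs only the standard library. The function names, the frame length, and the test tone are assumptions.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2 of a real-valued frame (naive O(N^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def spectral_centroid(frame, sample_rate):
    """Amplitude-weighted mean frequency of one frame in Hz (0.0 for a silent frame)."""
    magnitudes = magnitude_spectrum(frame)
    total = sum(magnitudes)
    if total == 0.0:
        return 0.0
    bin_width = sample_rate / len(frame)  # frequency spacing between bins, in Hz
    return sum(k * bin_width * m for k, m in enumerate(magnitudes)) / total

# A 1000 Hz sine sampled at 8000 Hz; 64 samples place the tone exactly on bin 8.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(64)]
centroid = spectral_centroid(frame, sample_rate)
```

For this pure tone the centroid comes out very close to 1000 Hz. A brighter sound, with more energy in high bins, would move the value upward.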
In the context of discrete-time signals, a zero crossing is said to occur if two successive samples have opposite signs. The rate at which zero crossings occur is a simple measure of the frequency content of the signal. This is particularly true of narrowband signals. Because audio signals might include both narrowband and broadband signals, the interpretation of the average zero-crossing rate is less precise. However, we can still obtain rough estimates of the spectral properties using a representation based on the short-time average zero-crossing rate.
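The definition above translates directly into code. This sketch counts sign changes between successive samples and normalizes by the number of sample pairs, then applies the same measure frame by frame to obtain a short-time rate. The function names, the frame length, and the hop size are assumptions chosen for illustration.

```python
def zero_crossing_rate(frame):
    """Fraction of successive sample pairs with strictly opposite signs."""
    if len(frame) < 2:
        return 0.0
    # A product below zero means one sample is positive and the other negative;
    # pairs that include an exact zero are not counted.
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
    return crossings / (len(frame) - 1)

def short_time_zcr(signal, frame_length, hop):
    """Zero-crossing rate of each full frame taken every `hop` samples."""
    return [
        zero_crossing_rate(signal[start:start + frame_length])
        for start in range(0, len(signal) - frame_length + 1, hop)
    ]

# First half alternates sign on every sample; second half never changes sign.
signal = [1, -1, 1, -1, 1, 2, 3, 4]
rates = short_time_zcr(signal, frame_length=4, hop=4)
```

The result is `[1.0, 0.0]`: the rapidly alternating frame has the maximum rate, while the frame with no sign changes has a rate of zero.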
The volume distribution of audio clips reveals the signal magnitude’s temporal variation. Volume is a subjective measure, which depends on the human listener’s frequency response. Normally volume is approximated by the root mean square value of the signal magnitude within each frame.

VDR(v) = [max(v) - min(v)] / max(v)

PERCEPTUAL FEATURES EXTRACTION
Dynamics refers to the volume of musical sound related to the music’s loudness or softness, which is always a relative indication, dependent on the context.

Tempo features
One of the most important features that makes the flow of music unique and differentiates it from other types of audio signal is temporal organization (beat rate).
Pitch perception plays an important role in human hearing, and the auditory system apparently assigns a pitch to anything that comes to its attention.
Before the development of the pivot vector space approach, audio-video mixing could be carried out only by professional mixing artists.
The development of multimedia technology today is vast, as can be seen from the number of multimedia products released in the market. Products such as digital video camcorders and handicams have enabled even ordinary home users to produce their own videos. However, employing professional audio-mixing artists is not feasible, since it is expensive, time-consuming, and tedious.
The pivot vector space approach enables home video users and amateur video enthusiasts to give a professional look and feel to their videos. This technique also eliminates the need for professional mixing artists and hence saves cost. Besides, it is not time-consuming.
Since this approach is fully automatic, selecting the best audio clip (from the given database) to be mixed with the given video shot, users need not worry about their own aesthetic ability in selecting the audio clip.
Fig(4): Application of Audio Video Mixing.
In today’s information technology era, advances in IT fields such as multimedia and networking are very fast. Newer and better technologies arise as each day passes. This is evident from the release of a number of technology-packed products such as portable MP3 players, digital cameras, digital video camcorders, handicams, and mobile handsets.
Before the advent of such technologies, activities such as video production could be done only by professional video artists. Today, however, products such as handicams and digital video camcorders make video production easy for home video users and amateur video enthusiasts. As a result, a large amount of home video footage is now being produced.
The pivot vector space approach is a novel technique for these users, since it can give their videos a professional look and feel. It eliminates the need for professional mixing artists and hence cuts down the cost, time, and labour involved. Hence, the demand for such a technique is expected to increase in the coming years, and it is likely to have a significant impact on the IT market.
The pivot vector space approach adds a new dimension to the field of audio-video mixing. Before the advent of this technology, audio-video mixing was carried out only by professional mixing artists, a process that is expensive, tedious, and time-consuming. The pivot vector space approach changes this scenario. Since the technique is fully automatic, it enables home video users to give their videos a professional look and feel. It also eliminates the need for professional mixing artists, thereby significantly reducing the cost, time, and labour involved. A large amount of home video footage is now being produced with products such as digital video camcorders and handicams. Hence, this technique will be of great use to amateur video enthusiasts and home video users.