Search Results

Showing 1 - 5 of 5
  • Publication
    A survey of algorithms and architectures for H.264 sub-pixel motion estimation
    (World Scientific, 2012-05) Fatemi, Mohammad Reza Hosseiny; Ateş, Hasan Fehmi; Salleh, Rosli Bin
    This paper reviews recent state-of-the-art H.264 sub-pixel motion estimation (SME) algorithms and architectures. First, H.264 SME is analyzed and the impact of its functionalities on coding performance is investigated. Then, the design space of SME algorithms is explored, covering design problems, approaches, and recent advanced algorithms. In addition, design challenges and strategies of SME hardware architectures are discussed and promising architectures are surveyed. Further perspectives and future prospects are also presented to highlight emerging trends and the outlook for SME designs.
  • Publication
    Kernel likelihood estimation for superpixel image parsing
    (Springer Verlag, 2016) Ateş, Hasan Fehmi; Sünetci, Sercan; Ak, Kenan Emir
    In superpixel-based image parsing, the image is first segmented into visually consistent small regions, i.e., superpixels; the superpixels are then parsed into different categories. The SuperParsing algorithm provides an elegant nonparametric solution to this problem without any need for classifier training. Superpixels are labeled based on likelihood ratios that are computed from class-conditional density estimates of feature vectors. In this paper, local kernel density estimation is proposed to improve the estimation of likelihood ratios and hence the labeling accuracy. By optimizing kernel bandwidths for each feature vector, feature densities are better estimated, especially when the set of training samples is sparse. The proposed method is tested on the SIFT Flow dataset consisting of 2,688 images and 33 labels, and is shown to outperform SuperParsing and some of its extended versions in terms of classification accuracy.
  • Publication
    Enhanced low bitrate H.264 video coding using decoder-side super-resolution and frame interpolation
    (SPIE-SOC Photo-Optical Instrumentation Engineers, 2013-07) Ateş, Hasan Fehmi
    Advanced inter-prediction modes have recently been introduced in the literature to improve the video coding performance of both the H.264 and High Efficiency Video Coding standards. Decoder-side motion analysis and motion vector derivation have been proposed to reduce the coding cost of motion information. Here, we introduce enhanced skip and direct modes for H.264 coding using decoder-side super-resolution (SR) and frame interpolation. P- and B-frames are downsampled and H.264 encoded at lower resolution (LR). The reconstructed LR frames are then super-resolved using decoder-side motion estimation. Alternatively, for B-frames, bidirectional true motion estimation is performed to synthesize a B-frame from its reference frames. For P-frames, bicubic interpolation of the LR frame is used as an alternative to SR reconstruction. A rate-distortion optimal mode selection algorithm is developed to decide, for each macroblock (MB), which of the two reconstructions to use as the skip/direct mode prediction. Simulations indicate an average peak signal-to-noise ratio (PSNR) improvement of 1.04 dB or a 23.0% bitrate reduction at low bitrates when compared with the H.264 standard. The PSNR gains reach as high as 3.00 dB for inter-predicted frames and 3.78 dB when only B-frames are considered. Decoded videos also exhibit significantly better visual quality.
  • Publication
    Fast algorithm analysis and bit-serial architecture design for sub-pixel motion estimation in H.264
    (World Scientific Publishing Company, 2010-12) Fatemi, Mohammad Reza Hosseiny; Ateş, Hasan Fehmi; Salleh, Rosli Bin
    Sub-pixel motion estimation (SME), together with the interpolation of reference frames, is a computationally intensive part of the H.264 encoder that increases the memory requirement 16 times for each reference frame. Because of the high computational complexity and memory requirement of H.264 SME, its hardware architecture design is an important issue, especially in high-resolution or low-power applications. To address these difficulties, we propose several optimization techniques at both the algorithm and architecture levels. At the algorithm level, we propose a parabolic-based algorithm for SME with quarter-pixel accuracy which reduces the computational budget by 94.35% and the memory access requirement by 98.5% in comparison to the standard interpolate-and-search method. In addition, a fast version of the proposed algorithm is presented that reduces the computational budget by a further 46.28% while maintaining the video quality. At the architecture level, we propose a novel bit-serial architecture for our algorithm. Owing to the advantages of the bit-serial architecture, it has a low gate count, a high operating frequency, low-density interconnection, and a reduced number of I/O pins. Several optimization techniques, including truncation of the sum of absolute differences, exploitation of source sharing, and power-saving techniques, are also applied to the proposed architecture to reduce power consumption and area. Our design saves between 57.71% and 90.01% of area cost and improves the macroblock (MB) processing speed by a factor of 1.7 to 8.44 when compared to previous designs. Implementation results show that our design can support the real-time HD1080 format with a gate count of 20.3 k at an operating frequency of 144.9 MHz.
  • Publication
    Improving semantic segmentation with generalized models of local context
    (Springer International Publishing AG, 2017) Ateş, Hasan Fehmi; Sünetci, Sercan
    Semantic segmentation (i.e., image parsing) aims to annotate each image pixel with its corresponding semantic class label. Spatially consistent labeling of the image requires an accurate description and modeling of the local contextual information. Superpixel image parsing methods provide this consistency by carrying out labeling at the superpixel level based on superpixel features and neighborhood information. In this paper, we develop generalized and flexible contextual models for superpixel neighborhoods in order to improve parsing accuracy. Instead of using a fixed segmentation and neighborhood definition, we explore various contextual models to combine complementary information available in alternative superpixel segmentations of the same image. Simulation results on two datasets demonstrate significant improvement in parsing accuracy over the baseline approach.
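
Illustrative note on the listed sub-pixel motion estimation work: the 2010 abstract above mentions a parabolic-based quarter-pixel search but does not give its details. The sketch below shows only the generic textbook idea such methods tend to build on, namely fitting a parabola through three matching costs around the best integer position and taking its vertex. The function names, the one-dimensional treatment, and the clamping to half a pixel are assumptions made for illustration, not the authors' algorithm.

```python
def parabolic_offset(cost_minus, cost_center, cost_plus):
    """Vertex of the parabola through costs sampled at x = -1, 0, +1.

    Hypothetical helper: the costs would typically be SAD values at the
    best integer-pixel position and its two neighbours along one axis.
    Returns an offset in pixels, clamped to [-0.5, 0.5].
    """
    curvature = cost_minus - 2 * cost_center + cost_plus
    if curvature <= 0:
        # Flat or non-convex cost curve: no usable minimum, keep integer position.
        return 0.0
    offset = (cost_minus - cost_plus) / (2 * curvature)
    return max(-0.5, min(0.5, offset))


def quantize_quarter_pel(offset):
    """Round an offset in pixels to the nearest quarter-pixel step."""
    return round(offset * 4) / 4


# Example: the cost is lower on the +1 side, so the minimum shifts right.
refined = quantize_quarter_pel(parabolic_offset(10, 4, 6))
```

In a two-dimensional search the same fit could be applied separately along the horizontal and vertical axes. The appeal of this kind of approach is that it needs only a few cost values already computed at integer positions, rather than interpolated reference frames.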