SemaMediaData Technologies

  • SemaMediaData provides cutting-edge technologies for multimedia analysis and retrieval. In particular, its image OCR software achieved the top ranking in all three evaluation tasks of the ICDAR 2011 Robust Reading Competition for born-digital images. Since then, SemaMediaData scientists have further improved the software's performance and accuracy. In addition, SemaMediaData's solution provides an efficient way of indexing and exploring lecture videos in large lecture video archives.

  • SemaMediaData OCR in the wild SemaMediaData's scene text recognition technology can be applied in interactive scenarios.

  • Video OCR The figure above shows the main workflow of a video OCR framework and gives a conceptual overview of the technology. We first use a video segmentation and/or frame selection method to generate a visual summary of the video content; we then apply our image text recognition engine to retrieve textual metadata. Both visual and textual metadata can be used for further indexing and for the search process. Text recognition in video frames or images is much more challenging than in scanned documents, which mostly have good contrast and a uniform background. Extracting text from videos requires a set of sophisticated pre-processing procedures, such as text localization and text segmentation. At the present stage, SemaMediaData's video OCR technology is suited for processing overlay text and some scene text with a sufficient contrast ratio within video frames and images.

  • Lecture video analysis SemaMediaData's lecture video retrieval solution provides an entire workflow for structural segmentation of lecture videos, video OCR analysis, automated lecture outline extraction from OCR transcripts, content-based keyword browsing and video search by using OCR and other analysis results.

  • Shot boundary detection SemaMediaData's Shot Boundary Detection (SBD) method partitions a video stream into representative segments for efficient browsing and accessing of video content. In addition, the result of SBD is often used as a preprocessing step for other analysis methods, such as key-frame selection, video OCR, and image/video concept detection. The main types of shot boundaries are hard cut, fade in/fade out, and dissolve. Currently, SemaMediaData's SBD engine detects hard cut transitions. More transition types will be supported in the near future.

  • Image/Video concept classification The semantic concepts contained in an image or a video scene provide valuable information for multimedia retrieval. With SemaMediaData's image concept detector, a video scene or an image can be classified into predefined concept classes, e.g., cars, football, skiing, or indoor/outdoor. These classification results can be applied to image search or video indexing.
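
The hard cut case described under shot boundary detection can be illustrated with a minimal sketch. It is not SemaMediaData's SBD engine, whose method is not described here; it assumes a common textbook approach in which consecutive frames are compared by their intensity histograms and a cut is reported when the difference exceeds a threshold. The function names, the 16-bin histogram, and the threshold of 0.5 are illustrative assumptions.

```python
from typing import List, Sequence

# A frame is modelled as a flat, non-empty sequence of grayscale values (0..255).
Frame = Sequence[int]


def gray_histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Return a normalized intensity histogram with `bins` buckets."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // 256] += 1
    total = len(frame)
    return [count / total for count in counts]


def histogram_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Half the L1 distance between two normalized histograms (0 = same, 1 = disjoint)."""
    return 0.5 * sum(abs(x - y) for x, y in zip(a, b))


def detect_hard_cuts(frames: Sequence[Frame], threshold: float = 0.5) -> List[int]:
    """Return every index i where a hard cut is assumed between frame i-1 and frame i."""
    cuts = []
    previous = None
    for index, frame in enumerate(frames):
        current = gray_histogram(frame)
        if previous is not None and histogram_distance(previous, current) > threshold:
            cuts.append(index)
        previous = current
    return cuts


# Toy "video": three dark frames followed by three bright frames.
dark = [20] * 64
bright = [230] * 64
video = [dark, dark, dark, bright, bright, bright]
cuts = detect_hard_cuts(video)
print(cuts)  # [3]: the cut lies between frame 2 and frame 3
```

Gradual transitions such as fades and dissolves spread the change over many frames, so a single frame-to-frame threshold like this one would typically miss them.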

SemaMediaData Services

SemaMediaData is a startup company that offers online as well as offline multimedia analysis services based on cutting-edge multimedia processing technologies. SemaMediaData technologies serve as a cornerstone for various innovative applications, e.g., automatic video annotation, video indexing, video browsing, content-based video search engines, and social media mining.

Online Analysis

SemaMediaData offers several novel online multimedia analysis services which can either be accessed by using a REST API or our File-Upload-Analysis service.

  • Facebook's ad policy checker Do you want to check whether Facebook will approve your image as part of an advert? Choose an image from your PC and let us detect how much text that image contains. We will then tell you whether your image will pass Facebook's text-to-image ratio requirement (20% or less text). View Demo

  • Image OCR provides functionality for detecting and recognizing text content in video frames and images. It differs from a conventional print OCR engine, which is designed for high-resolution scans of printed documents with a uniform background. The Image OCR engine of SemaMediaData applies a set of sophisticated preprocessing procedures such as text localisation, background reparation, and text binarization, which enable text recognition in more complex environments. The current version of Image OCR is designed for detecting and recognizing overlay text and some scene text (with sufficient contrast and horizontal alignment) in video frames and images. View Demo

  • Video OCR is an analysis cascade that includes automated video segmentation, video text detection, text recognition, and named entity recognition (NER), the last of which is a free add-on feature. The analysis result of this service enables automatic video retrieval and indexing as well as content-based video search in video portals and digital archives. A detailed example can be found on our demo website. View Demo

  • Video key-frame extraction also consists of a cascade of analysis procedures, including video key-frame extraction based on a user-defined number of key-frames, video text recognition, and NER (a free add-on feature). The biggest difference from the video OCR service is that it enables rapid video summarization regardless of the video length by using a predefined number of key-frames. View Demo

  • Lecture video analysis engine is designed for analyzing lecture recordings that are produced by capturing the slides displayed on the computer screen during the lecture. The main idea of this approach is to capture the temporal scope of each unique slide in the video. The analysis process consists of slide transition detection, unique slide extraction, text recognition, lecture outline extraction from OCR text, and NER (a free add-on feature). Based on the analysis result, various applications can be implemented for automatic lecture video indexing and browsing. View Demo

  • Video shot boundary detection separates a video stream into a set of individual scenes by detecting camera transitions automatically. Based on the result, the user can obtain a quick overview of the video content by browsing the key-frames extracted from each video scene. Furthermore, with the corresponding time information, the user can navigate directly to the desired video content. View Demo
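
The 20% text limit mentioned for the ad policy checker can be illustrated with a small sketch. How the service or Facebook actually measures the text share is not stated here; the sketch assumes a coarse grid measure in which the image is divided into equal cells and every cell touched by a detected text box counts as text. The box coordinates, the 5x5 grid, and the function names are hypothetical, and the text boxes are assumed to come from some text detector.

```python
from typing import List, Tuple

# (left, top, right, bottom) in pixels; right and bottom are exclusive.
Box = Tuple[int, int, int, int]


def text_cell_ratio(width: int, height: int, boxes: List[Box], grid: int = 5) -> float:
    """Fraction of grid cells that overlap at least one text box."""
    covered = 0
    for row in range(grid):
        for col in range(grid):
            cell_left = col * width / grid
            cell_right = (col + 1) * width / grid
            cell_top = row * height / grid
            cell_bottom = (row + 1) * height / grid
            # Standard axis-aligned rectangle overlap test against every box.
            if any(
                left < cell_right and right > cell_left
                and top < cell_bottom and bottom > cell_top
                for (left, top, right, bottom) in boxes
            ):
                covered += 1
    return covered / (grid * grid)


def passes_text_limit(ratio: float, limit: float = 0.20) -> bool:
    """True when the text share is at or below the limit (20% in the text above)."""
    return ratio <= limit


# Made-up example: a 1000x1000 image with a headline box and a small logo box.
boxes = [(50, 50, 550, 150), (820, 820, 980, 980)]
ratio = text_cell_ratio(1000, 1000, boxes)
print(ratio, passes_text_limit(ratio))  # 4 of 25 cells contain text
```

A grid measure is deliberately coarse: a small box that straddles a cell border counts for both cells, so the placement of text matters as well as its size.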

Offline Services and Tools

SemaMediaData also offers offline services and tools addressing your specific requirements.

  • Customized video indexing and retrieval solution SemaMediaData can set up a local server within a partner's own systems, providing the required media analysis technologies. Suitable application areas include surveillance videos, IM videos, social media content retrieval, event recordings (conferences, workshops), TV programs, films, and documentary recordings.

  • Consulting service for multimedia processing, automated content tagging, and information retrieval. The SemaMediaData experts will help you to build innovative multimedia retrieval applications for large-scale processing systems.

  • Offline Demo Tool Our Offline Demo Tool can be used for demonstrating the analysis result on your local PC. We provide you with an HTML5-based solution that will satisfy your particular needs.

Trial Run

Since the accuracy and hit rate of a multimedia retrieval system can be affected by factors such as heterogeneous backgrounds, scale and illumination changes, occlusions, varying image resolution and contrast, geometric distortions, and compression artifacts, the analysis result will depend heavily on the quality of the supplied material.

SemaMediaData provides the opportunity to verify whether the results satisfy the needs of each individual user. We offer 200 credit points for each new registration, which can be used for analyzing 25 minutes of video or 200 images for free. Furthermore, we also provide demonstration functionalities in the 'Task Manager' page, by which users can directly browse the result by using several indexing features.
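
The trial budget above implies simple per-unit rates: 200 credits for 200 images is 1 credit per image, and 200 credits for 25 minutes of video is 8 credits per minute. The sketch below works through that arithmetic. It assumes that pricing is linear and that the budget can be split between video and images, neither of which is stated above; the function names are illustrative.

```python
TRIAL_CREDITS = 200

# Rates derived from the trial offer, assuming linear pricing (an assumption).
CREDITS_PER_IMAGE = TRIAL_CREDITS / 200        # 1.0 credit per image
CREDITS_PER_VIDEO_MINUTE = TRIAL_CREDITS / 25  # 8.0 credits per minute of video


def credits_needed(video_minutes: float, images: int) -> float:
    """Estimated credits for a mixed workload under the assumed linear rates."""
    return video_minutes * CREDITS_PER_VIDEO_MINUTE + images * CREDITS_PER_IMAGE


def fits_trial(video_minutes: float, images: int) -> bool:
    """True when the estimated cost stays within the 200 trial credits."""
    return credits_needed(video_minutes, images) <= TRIAL_CREDITS


print(credits_needed(10, 100), fits_trial(10, 100))  # 180.0 True
print(credits_needed(20, 50), fits_trial(20, 50))    # 210.0 False
```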

Application Fields and Examples

SemaMediaData multimedia analysis technologies serve as a cornerstone for various applications, e.g.,

  • Automatic video and image annotation
  • Automatic video indexing and browsing
  • Content-based video search engine
  • Social media mining, web data mining etc.

SemaMediaData technologies have already been successfully applied in several large research projects:

  • tele-TASK (tele-Teaching Anywhere Solution Kit) consists of an advanced mobile system for the production of Internet streaming videos and podcasts, and a large lecture video portal. By applying SemaMediaData's video OCR technique, the textual content of the video is automatically extracted and provided to the search engine. The tele-TASK user can see how often a search term occurs in the lecturer's speech and/or on the lecture slides. Other innovative applications based on SemaMediaData's technologies include automatic lecture slide extraction, lecture outline extraction, and content-based keyword browsing (using keywords from speech and OCR). The figure below shows the content-based lecture video search application in the tele-TASK portal, where the user can find search hits for terms within speech and video text. With this feature, the user can find interesting lecture topics more easily and quickly, and can view the topic segment conveniently in a popup video player. An example lecture video can be found by following this link.

  • openHPI is the educational Internet platform of the German Hasso Plattner Institute, Potsdam. openHPI works according to the principle of “Massive Open Online Courses” (MOOC): the user takes part in a worldwide social learning network based on interactive online courses covering different subjects in Information and Communications Technology (ICT). SemaMediaData's technologies have been applied in openHPI for automated lecture slide extraction from videos. The figure below shows the extracted lecture slides integrated into the openHPI video player.

  • MEDIAGLOBE - the digital archive is part of the THESEUS research program initiated by the German Federal Ministry of Economy and Technology (BMWi). MEDIAGLOBE deals with digitization, analysis, and semantic retrieval of historical, documentary audiovisual content. SemaMediaData's video OCR technology plays an important role in automatic textual metadata generation in the MEDIAGLOBE AV archive.


  • Personally assisted validation of identities and documents. SELFYDENT offers fast, reliable and secure technologies to identify individuals based on their identity cards via video conference. SemaMediaData's video OCR technology is integrated into the ID card validation process. This enables the machine to read the ID document from the video stream automatically, providing an objective check that supplements the human validator.