DigitalMR proposes to investigate methods of determining themes in collections of images
that accompany social media posts. The methodology is inspired by recent advances in deep
learning that have benefited from the availability of large training data sets along with
increased computation power through heterogeneous computing. The concept operates
unsupervised and leverages the “deep learning framework” to determine themes and establish
their relevance to brands or organisations using hierarchical structures. The concept also
assigns labels to identified themes (topics) and determines ways that describe the theme such
that it can be applied to market research and insight management tasks, such as sentiment and
semantic analysis. The target outcome of the project is to discover the potential of, and secure
the capability for, theme detection in image collections for commercial applications. This
capability will ultimately enhance listening247, a social media listening and analytics system
(text based) whose effectiveness has been proven by a range of private and public sector
organisations. The core R&D tasks include methods for learning from imbalanced datasets, deep
architecture selection, deep learning via dedicated classifiers and ensemble formulations. The
project will make use of standard datasets for training and testing. The DigitalMR team will
benefit from systematic input by a specialist UK academic team (subcontractor) as well as
user feedback from corporate challenge partners.
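As an illustration of the "learning from imbalanced datasets" task named above, the sketch below computes inverse-frequency class weights, one common way to stop a classifier from ignoring rare themes. The abstract does not say which method the project uses, so this technique, the function name `inverse_frequency_weights` and the example theme labels are all assumptions made for illustration.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return a weight per class: n_samples / (n_classes * class_count).

    Rare classes receive weights above 1 and frequent classes below 1,
    so each class contributes equally to a weighted training loss.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical theme labels for ten images: eight "logo", two "food".
example_labels = ["logo"] * 8 + ["food"] * 2
weights = inverse_frequency_weights(example_labels)
```

With these example labels the rare "food" theme is weighted four times more heavily than the frequent "logo" theme; such weights would typically be passed to the loss function of whatever classifier is trained.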
DigitalMR proposes to investigate the feasibility of a large-scale visual sentiment analysis system. The key goal of the project is to construct a high-performance sentiment analysis engine that applies a mid-level representation to bridge the gap between semantic concepts and sentiment/emotion perception. Such systems are increasingly in demand among senior marketing executives, who are looking for ways to include the growing number of image-related posts, which account for around a third of all tweets. The project is aligned with the data exploration part of the competition, and in particular with the subtopics: a) automated & intelligent data cleansing & semantic annotation and b) new algorithms and approaches to extract value and insight from complex data sets.
Automatic emotion detection in short text posts represents an exciting avenue for R&D at the
intersection of text and data mining in social media. Because social media has no
geographic or time boundaries, people post comments from everywhere and in
different emotional states. Organisations are therefore interested in knowing how their
clients feel about their own brands and those of competitors. This project is about
establishing the potential benefits of augmenting listening247, the main software as a service
platform of DigitalMR, with emotion detection capabilities. The R&D tasks include machine
learning modelling of psycholinguistic phenomena. The project will produce dedicated large
datasets for training and testing. The DigitalMR team will benefit from regular input by a
scientific panel with experts in psychology and machine learning from UK Universities as
well as user feedback from three corporate challenge partners.
The Fareviz project aims to build innovative new digital services around a data warehouse of newly opened rail fares from RSP (part of the Association of Train Operating Companies). The project will offer analytics and visualisations around the 1.25bn possible rail fares as defined over permitted routes, with discount options and by different operators. This will enable government, regulators and travellers to benefit from an increased understanding of the highly complex and politically sensitive fares system through a new fares API designed for use by third parties in apps and services. The fare service will offer ticket fulfilment with social media feedback, so that we can explore the use of the site to identify further services of interest to users.
DigitalMR proposes to develop a real-time social media marketing monitoring system. It will combine data from corporate systems and millions of blogs, boards, videos and news items from different social media sites to present complex information quickly and clearly. Such systems are increasingly in demand among senior marketing executives, who are looking for ways to sift through fast-changing data across geographies, languages and time zones. This feasibility study will utilise market-leading, multilingual technology owned by DigitalMR, namely eListen. It focuses on usability and on knowing our user markets, and tests solutions with real users who are not ICT specialists. It combines data currently sitting in silos, improving efficiency and enabling scalable, market-leading visualisation.
DigitalMR will work with cybersecurity experts to conduct critical security audits for its software as a service platforms.
GRD Development of Prototype
The evolution of the internet, social media networking and 24h multimedia channels means that there are many information sources from which various views about a subject such as brand, topic or product category may be drawn. The individual postings via online channels are numerous, meaning that a significant opportunity has emerged for companies to learn about their customers by monitoring online sources.
This project is about a prototype platform that brings together a number of state-of-the-art technologies that aim, for the first time, to offer fast and accurate multilingual text analysis by combining statistical and linguistics-based algorithms, Boolean logic and other methods to increase sentiment accuracy. Known linguistic systems need a lead time of 6 months to add a new language. We will apply a combination of statistical algorithms that will allow us to set up a new language within 2-3 weeks. In addition, linguistic algorithms will be applied later in the process to enhance the overall accuracy of sentiment analysis. The project will establish efficient statistical analysis algorithms enhanced with machine learning capabilities, and will combine these with linguistic analysis algorithms and Boolean logic search strings to greatly enhance the currently available technologies. The combination will be used primarily for the accurate prediction of sentiment in text, but will also offer significant benefits for other important binary decisions, such as whether a comment contains a selling opportunity, and for detecting linguistic phenomena such as irony or sarcasm.
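The combination described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the project's implementation: the word weights stand in for a trained statistical model, the negation rule stands in for the linguistic stage, and the Boolean query (the hypothetical brand "acme", excluding recruitment posts) stands in for a Boolean search string. All names and values are invented for the example.

```python
import re

# Hypothetical word weights standing in for a trained statistical model.
WORD_WEIGHTS = {"great": 1.0, "love": 1.0, "good": 0.5,
                "bad": -0.5, "awful": -1.0, "hate": -1.0}
NEGATORS = {"not", "never", "no"}


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def in_scope(tokens):
    """Boolean search string (assumed): acme AND NOT (hiring OR vacancy)."""
    words = set(tokens)
    return "acme" in words and not ({"hiring", "vacancy"} & words)


def statistical_score(tokens):
    """Statistical stage: sum the weights of known sentiment words."""
    return sum(WORD_WEIGHTS.get(tok, 0.0) for tok in tokens)


def linguistic_adjustment(tokens):
    """Linguistic stage: flip a sentiment word directly preceded by a negator."""
    adjustment = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        if prev in NEGATORS and cur in WORD_WEIGHTS:
            adjustment -= 2 * WORD_WEIGHTS[cur]
    return adjustment


def classify(text):
    """Filter with the Boolean query, then score with both stages."""
    tokens = tokenize(text)
    if not in_scope(tokens):
        return "out_of_scope"
    score = statistical_score(tokens) + linguistic_adjustment(tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

In this sketch only the word weights would need to be learned for a new language, while the negation rule is a later language-specific refinement, which mirrors the ordering described in the abstract.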
The process of analysis is supplemented by an advanced reporting tool. The tool will be developed for both PCs and mobiles. The deliverable from this project will demonstrate successful development and deployment of active web listening, together with an extensive analysis to confirm the potential for translation into a viable commercial product with significant potential for UK exports over the next 5 years.