Political Advertising Dataset: the use case of the Polish 2020 Presidential Elections

Political campaigns are full of political ads posted by candidates on social media. Political advertisements constitute a basic form of campaigning, subject to various social requirements. We present the first publicly open dataset for detecting …

Punctuation Prediction in Spontaneous Conversations: Can We Mitigate ASR Errors with Retrofitted Word Embeddings?

Automatic Speech Recognition (ASR) systems introduce word errors, which often confuse punctuation prediction models, turning punctuation restoration into a challenging task. These errors usually take the form of homonyms. We show how retrofitting of …

WER we are and WER we think we are

Natural language processing of conversational speech requires the availability of high-quality transcripts. In this paper, we express our skepticism towards the recent reports of very low Word Error Rates (WERs) achieved by modern Automatic Speech …

Aspect Detection using Word and Char Embeddings with (Bi) LSTM and CRF

Avaya Conversational Intelligence: A Real-Time System for Spoken Language Understanding in Human-Human Call Center Conversations

Extracting Aspects Hierarchies Using Rhetorical Structure Theory

We propose a novel approach to generate aspect hierarchies that proved to be consistently correct compared with human-generated hierarchies. We present an unsupervised technique using Rhetorical Structure Theory and graph analysis. We evaluated our …

Method for Aspect-Based Sentiment Annotation Using Rhetorical Analysis

Sentiment Analysis for Polish Using Transfer Learning Approach

This paper proposes a method for assigning sentiment polarity to textual content written in Polish, using a supervised machine learning approach with a transfer learning scheme. It has been shown that performing simple natural language processing …