Engineering papers

[1] arXiv:2006.09332 [pdf]

Explorable Decoding of Compressed Images

Yuval Bahat, Tomer Michaeli

The ever-growing amounts of visual contents captured on a daily basis necessitate the use of lossy compression methods in order to save storage space and transmission bandwidth. While extensive research efforts are devoted to improving compression techniques, every method inevitably discards information. Especially at low bit rates, this information often corresponds to semantically meaningful visual cues, so that decompression involves significant ambiguity. In spite of this fact, existing decompression algorithms typically produce only a single output, and do not allow the viewer to explore the set of images that map to the given compressed code. Recently, explorable image restoration has been studied in the context of super-resolution. In this work, we propose to take this idea to the realm of image decompression. Specifically, we develop a novel deep-network based decoder architecture for the ubiquitous JPEG standard, which allows traversing the set of decompressed images that are consistent with the compressed input code. To allow for simple user interaction, we also develop a graphical user interface that comprises several intuitive exploration and editing tools. We exemplify our framework on graphical, medical and forensic use cases, demonstrating its wide range of potential applications.

[2] arXiv:2006.09306 [pdf]

Learning About Objects by Learning to Interact with Them

Martin Lohmann, Jordi Salvador, Aniruddha Kembhavi, Roozbeh Mottaghi

Much of the remarkable progress in computer vision has been focused around fully supervised learning mechanisms relying on highly curated datasets for a variety of tasks. In contrast, humans often learn about their world with little to no external supervision. Taking inspiration from infants learning from their environment through play and interaction, we present a computational framework to discover objects and learn their physical properties along this paradigm of Learning from Interaction. Our agent, when placed within the near photo-realistic and physics-enabled AI2-THOR environment, interacts with its world and learns about objects, their geometric extents and relative masses, without any external guidance. Our experiments reveal that this agent learns efficiently and effectively; not just for objects it has interacted with before, but also for novel instances from seen categories as well as novel object categories.

[3] arXiv:2006.09241 [pdf]

Two-Dimensional Non-Line-of-Sight Scene Estimation from a Single Edge Occluder

Sheila W. Seidel, John Murray-Bruce, Yanting Ma, Christopher Yu, William T. Freeman, Vivek K Goyal

Passive non-line-of-sight imaging methods are often faster and stealthier than their active counterparts, requiring less complex and costly equipment. However, many of these methods exploit motion of an occluder or the hidden scene, or require knowledge or calibration of complicated occluders. The edge of a wall is a known and ubiquitous occluding structure that may be used as an aperture to image the region hidden behind it. Light from around the corner is cast onto the floor forming a fan-like penumbra rather than a sharp shadow. Subtle variations in the penumbra contain a remarkable amount of information about the hidden scene. Previous work has leveraged the vertical nature of the edge to demonstrate 1D (in angle measured around the corner) reconstructions of moving and stationary hidden scenery from as little as a single photograph of the penumbra. In this work, we introduce a second reconstruction dimension range measured from the edge. We derive a new forward model, accounting for radial falloff, and propose two inversion algorithms to form 2D reconstructions from a single photograph of the penumbra. Performances of both algorithms are demonstrated on experimental data corresponding to several different hidden scene configurations. A Cramer-Rao bound analysis further demonstrates the feasibility (and utility) of the 2D corner camera.

[4] arXiv:2006.09233 [pdf]

Towards Deductive Verification of Control Algorithms for Autonomous Marine Vehicles

Simon Foster, Mario Gleirscher, Radu Calinescu

The use of autonomous vehicles in real-world applications is often precluded by the difficulty of providing safety guarantees for their complex controllers. The simulation-based testing of these controllers cannot deliver sufficient safety guarantees, and the use of formal verification is very challenging due to the hybrid nature of the autonomous vehicles. Our work-in-progress paper introduces a formal verification approach that addresses this challenge by integrating the numerical computation of such a system (in GNU/Octave) with its hybrid system verification by means of a proof assistant (Isabelle). To show the effectiveness of our approach, we use it to verify differential invariants of an Autonomous Marine Vehicle with a controller switching between multiple modes.

[5] arXiv:2006.09222 [pdf]

Towards Automated Assessment of Stuttering and Stuttering Therapy

Sebastian P. Bayerl, Florian Hönig, Joelle Reister, Korbinian Riedhammer

Stuttering is a complex speech disorder that can be identified by repetitions, prolongations of sounds, syllables or words, and blocks while speaking. Severity assessment is usually done by a speech therapist. While attempts at automated assessment were made, it is rarely used in therapy. Common methods for the assessment of stuttering severity include percent stuttered syllables (% SS), the average of the three longest stuttering symptoms during a speech task, or the recently introduced Speech Efficiency Score (SES). This paper introduces the Speech Control Index (SCI), a new method to evaluate the severity of stuttering. Unlike SES, it can also be used to assess therapy success for fluency shaping. We evaluate both SES and SCI on a new comprehensively labeled dataset containing stuttered German speech of clients prior to, during, and after undergoing stuttering therapy. Phone alignments of an automatic speech recognition system are statistically evaluated in relation to their relative position to labeled stuttering events. The results indicate that phone length distributions differ with respect to their position in and around labeled stuttering events

[6] arXiv:2006.09178 [pdf]

Policy Gradient-based Algorithms for Continuous-time Linear Quadratic Control

Jingjing Bu, Afshin Mesbahi, Mehran Mesbahi

We consider the continuous-time Linear-Quadratic-Regulator (LQR) problem in terms of optimizing a real-valued matrix function over the set of feedback gains. The results developed are in parallel to those in Bu et al. [1] for discrete-time LTI systems. In this direction, we characterize several analytical properties (smoothness, coerciveness, quadratic growth) that are crucial in the analysis of gradient-based algorithms. We also point out similarities and distinctive features of the continuous time setup in comparison with its discrete time analogue. First, we examine three types of well-posed flows direct policy update for LQR gradient flow, natural gradient flow and the quasi-Newton flow. The coercive property of the corresponding cost function suggests that these flows admit unique solutions while the gradient dominated property indicates that the underling Lyapunov functionals decay at an exponential rate; quadratic growth on the other hand guarantees that the trajectories of these flows are exponentially stable in the sense of Lyapunov. We then discuss the forward Euler discretization of these flows, realized as gradient descent, natural gradient descent and quasi-Newton iteration. We present stepsize criteria for gradient descent and natural gradient descent, guaranteeing that both algorithms converge linearly to the global optima. An optimal stepsize for the quasi-Newton iteration is also proposed, guaranteeing a $Q$-quadratic convergence rate--and in the meantime--recovering the Kleinman-Newton iteration. Lastly, we examine LQR state feedback synthesis with a sparsity pattern. In this case, we develop the necessary formalism and insights for projected gradient descent, allowing us to guarantee a sublinear rate of convergence to a first-order stationary point.

[7] arXiv:2006.09170 [pdf]

Balanced truncation model reduction for symmetric second order systems -- A passivity-based approach

Ines Dorschky, Timo Reis, Matthias Voigt

We introduce a model reduction approach for linear time-invariant second order systems based on positive real balanced truncation. Our method guarantees asymptotic stability and passivity of the reduced order model as well as the positive definiteness of the mass and stiffness matrices. Moreover, we receive an a priori gap metric error bound. Finally, we show that our method based on positive real balanced truncation preserves the structure of overdamped second order systems.

[8] arXiv:2006.09145 [pdf]

End-to-End Inverse Design for Inverse Scattering via Freeform Metastructures

Zin Lin, Charles Roques-Carmes, Raphaël Pestourie, Marin Soljačić, Arka Majumdar, Steven G. Johnson

By co-designing a meta-optical frontend in conjunction with image processing backend, we demonstrate noise-robust subwavelength reconstruction of an image superior to an optics-only or computation-only approach. Our end-to-end inverse design couples the solution of the full Maxwell equations---exploiting all aspects of wave physics arising in subwavelength scatterers---with inverse-scattering algorithms in a single large-scale optimization involving $\gtrsim 10^4$ degrees of freedom. The resulting structures scatter light in a way that is radically different from either a conventional lens or a random microstructure, and suppress the noise sensitivity of the inverse-scattering computation by several orders of magnitude.

[9] arXiv:2006.09119 [pdf]

Query Intent Detection from the SEO Perspective

Samin Mohammadi, Mathieu Chapon, Arthur Fremond

Google users have different intents from their queries such as acquiring information, buying products, comparing or simulating services, looking for products, and so on. Understanding the right intention of users helps to provide i) better content on web pages from the Search Engine Optimization (SEO) perspective and ii) more user-satisfying results from the search engine perspective. In this study, we aim to identify the user query's intent by taking advantage of Google results and machine learning methods. Our proposed approach is a clustering model that exploits some features to detect query's intent. A list of keywords extracted from the clustered queries is used to identify the intent of a new given query. Comparing the clustering results with the intents predicted by filtered keywords show the efficiency of the extracted keywords for detecting intents.

[10] arXiv:2006.08270 [pdf]

How to design cell-mediated self-assembled colloidal scaffolds

C. S. Dias, C. A. Custodio, G. C. Antunes, M. M. Telo da Gama, J. F. Mano, N. A. M. Araujo

A critical step in tissue engineering is the design and synthesis of 3D biocompatible matrices (scaffolds) to support and guide the proliferation of cells and tissue growth. Most existing techniques rely on the processing of scaffolds under controlled conditions and then implanting them \textit{in vivo}, with questions related to biocompatibility and the implantation process that are still challenging. As an alternative, it was proposed to assemble the scaffolds \textit{in loco} through the self-organization of colloidal particles mediated by cells. In this study, we combine experiments, particle-based simulations, and mean-field calculations to show that, in general, the size of the self-assembled scaffold scales with the cell-to-particle ratio. However, we found an optimal value of this ratio, for which the size of the scaffold is maximal when cell-cell adhesion is suppressed. These results suggest that the size and structure of the self-assembled scaffolds may be designed by tuning the adhesion between cells in the colloidal suspension.

[11] arXiv:2006.08215 [pdf]

Joint Optimization of the Deployment and Resource Allocation of UAVs in Vehicular Edge Computing and Networks

Yuke Zheng, Bo Yang, Cailian Chen

With the development of smart vehicles, computing-intensive tasks are widely and rapidly generated. To alleviate the burden of on-board CPU, connected vehicles can offload tasks to or make request from nearby edge server thanks to the emerging Mobile Edge Computing (MEC). However, such approach may sharply increase the workload of an edge server, and cause network congestion, especially in rural and mountain areas where there are few edge servers. To this end, a UAV-assisted MEC system is proposed in this paper, and joint optimization algorithm of the deployment and resource allocation of UAVs (JOAoDR) is proposed to decide the location and balance the resource and rewards of the UAVs. We solve a long-term profit maximization problem in terms of the operator. Numerical results demonstrated that our algorithm outperforms other benchmarks algorithm, and validated our solution.

[12] arXiv:2006.08044 [pdf]

Survey on Physical Layer Security for 5G Wireless Networks

José David Vega Sánchez, Luis Urquiza-Aguiar, Martha Cecilia Paredes Paredes, Diana Pamela Moya Osorio

Physical layer security is a promising approach that can benefit traditional encryption methods. The idea of physical layer security is to take advantage of the features of the propagation medium and its impairments to ensure secure communication in the physical layer. This work introduces a comprehensive review of the main information-theoretic metrics used to measure the secrecy performance in physical layer security. Furthermore, a theoretical framework related to the most commonly used physical layer security techniques to improve the secrecy performance is provided. Finally, our work surveys physical layer security research over several enabling 5G technologies, such as massive multiple-input multiple-output, millimeter-wave communications, heterogeneous networks, non-orthogonal multiple access, and full-duplex. Also, we include the key concepts of each of the aforementioned technologies. Future fields of research and technical challenges of physical layer security are also identified.

[13] arXiv:2006.08024 [pdf]

Short-Range Ambient Backscatter Communication Using Reconfigurable Intelligent Surfaces

Mahyar Nemati, Jie Ding, Jinho Choi

Ambient backscatter communication (AmBC) has been introduced to address communication and power efficiency issues for short-range and low-power Internet-of-Things (IoT) applications. On the other hand, reconfigurable intelligent surface (RIS) has been recently proposed as a promising approach that can control the propagation environment especially in indoor communication environments. In this paper, we propose a new AmBC model over ambient orthogonal-frequency-division-multiplexing (OFDM) subcarriers in the frequency domain in conjunction with RIS for short-range communication scenarios. A tag transmits one bit per each OFDM subcarrier broadcasted from a WiFi access point. Then, RIS augments the signal quality at a reader by compensating the phase distortion effect of multipath channel on the incident signal. We also exploit the special spectrum structure of OFDM to transmit more data over its squeezed orthogonal subcarriers in the frequency domain. Consequently, the proposed method improves the bit-error-rate (BER) performance and provides a higher data rate compared to existing AmBC methods. Analytical and numerical evaluations show the superior performance of the proposed approach in terms of BER and data rate.

[14] arXiv:2006.07931 [pdf]

Solos A Dataset for Audio-Visual Music Analysis

Juan F. Montesinos, Olga Slizovskaia, Gloria Haro

In this paper, we present a new dataset of music performance videos which can be used for training machine learning methods for multiple tasks such as audio-visual blind source separation and localization, cross-modal correspondences, cross-modal generation and, in general, any audio-visual selfsupervised task. These videos, gathered from YouTube, consist of solo musical performances of 13 different instruments. Compared to previously proposed audio-visual datasets, Solos is cleaner since a big amount of its recordings are auditions and manually checked recordings, ensuring there is no background noise nor effects added in the video post-processing. Besides, it is, up to the best of our knowledge, the only dataset that contains the whole set of instruments present in the URMP [1] dataset, a highquality dataset of 44 multi-instrument audio-visual recordings of classical music pieces with individual audio tracks. URMP was intented to be used for source separation, thus, we evaluate the performance on the URMP dataset of two different BSS models trained on Solos

[15] arXiv:2006.07919 [pdf]

Vehicle Redistribution in Ride-Sourcing Markets using Convex Minimum Cost Flows

Renos Karamanis, Eleftherios Anastasiadis, Marc Stettler, Panagiotis Angeloudis

Ride-sourcing platforms often face imbalances in the demand and supply of rides across areas in their operating road-networks. As such, dynamic pricing methods have been used to mediate these demand asymmetries through surge price multipliers, thus incentivising higher driver participation in the market. However, the anticipated commercialisation of autonomous vehicles could transform the current ride-sourcing platforms to fleet operators. The absence of human drivers fosters the need for empty vehicle management to address any vehicle supply deficiencies. Proactive redistribution using integer programming and demand predictive models have been proposed in research to address this problem. A shortcoming of existing models, however, is that they ignore the market structure and underlying customer choice behaviour. As such, current models do not capture the real value of redistribution. To resolve this, we formulate the vehicle redistribution problem as a non-linear minimum cost flow problem which accounts for the relationship of supply and demand of rides, by assuming a customer discrete choice model and a market structure. We demonstrate that this model can have a convex domain, and we introduce an edge splitting algorithm to solve a transformed convex minimum cost flow problem for vehicle redistribution. By testing our model using simulation, we show that our redistribution algorithm can decrease wait times up to 50% and increase vehicle utilization up to 8%. Our findings outline that the value of redistribution is contingent on localised market structure and customer behaviour.

[16] arXiv:2006.07907 [pdf]

Trajectory Generation by Chance Constrained Nonlinear MPC with Probabilistic Prediction

Xiaoxue Zhang, Jun Ma, Zilong Cheng, Sunan Huang, Shuzhi Sam Ge, Tong Heng Lee

Continued great efforts have been dedicated towards high-quality trajectory generation based on optimization methods, however, most of them do not suitably and effectively consider the situation with moving obstacles; and more particularly, the future position of these moving obstacles in the presence of uncertainty within some possible prescribed prediction horizon. To cater to this rather major shortcoming, this work shows how a variational Bayesian Gaussian mixture model (vBGMM) framework can be employed to predict the future trajectory of moving obstacles; and then with this methodology, a trajectory generation framework is proposed which will efficiently and effectively address trajectory generation in the presence of moving obstacles, and also incorporating presence of uncertainty within a prediction horizon. In this work, the full predictive conditional probability density function (PDF) with mean and covariance is obtained, and thus a future trajectory with uncertainty is formulated as a collision region represented by a confidence ellipsoid. To avoid the collision region, chance constraints are imposed to restrict the collision probability, and subsequently a nonlinear MPC problem is constructed with these chance constraints. It is shown that the proposed approach is able to predict the future position of the moving obstacles effectively; and thus based on the environmental information of the probabilistic prediction, it is also shown that the timing of collision avoidance can be earlier than the method without prediction. The tracking error and distance to obstacles of the trajectory with prediction are smaller compared with the method without prediction.

[17] arXiv:2006.07898 [pdf]

The JHU Multi-Microphone Multi-Speaker ASR System for the CHiME-6 Challenge

Ashish Arora, Desh Raj, Aswin Shanmugam Subramanian, Ke Li, Bar Ben-Yair, Matthew Maciejewski, Piotr Żelasko, Paola García, Shinji Watanabe, Sanjeev Khudanpur

This paper summarizes the JHU team's efforts in tracks 1 and 2 of the CHiME-6 challenge for distant multi-microphone conversational speech diarization and recognition in everyday home environments. We explore multi-array processing techniques at each stage of the pipeline, such as multi-array guided source separation (GSS) for enhancement and acoustic model training data, posterior fusion for speech activity detection, PLDA score fusion for diarization, and lattice combination for automatic speech recognition (ASR). We also report results with different acoustic model architectures, and integrate other techniques such as online multi-channel weighted prediction error (WPE) dereverberation and variational Bayes-hidden Markov model (VB-HMM) based overlap assignment to deal with reverberation and overlapping speakers, respectively. As a result of these efforts, our ASR systems achieve a word error rate of 40.5% and 67.5% on tracks 1 and 2, respectively, on the evaluation set. This is an improvement of 10.8% and 10.4% absolute, over the challenge baselines for the respective tracks.

[18] arXiv:2006.07778 [pdf]

Cascaded deep monocular 3D human pose estimation with evolutionary training data

Shichao Li, Lei Ke, Kevin Pratama, Yu-Wing Tai, Chi-Keung Tang, Kwang-Ting Cheng

End-to-end deep representation learning has achieved remarkable accuracy for monocular 3D human pose estimation, yet these models may fail for unseen poses with limited and fixed training data. This paper proposes a novel data augmentation method that (1) is scalable for synthesizing massive amount of training data (over 8 million valid 3D human poses with corresponding 2D projections) for training 2D-to-3D networks, (2) can effectively reduce dataset bias. Our method evolves a limited dataset to synthesize unseen 3D human skeletons based on a hierarchical human representation and heuristics inspired by prior knowledge. Extensive experiments show that our approach not only achieves state-of-the-art accuracy on the largest public benchmark, but also generalizes significantly better to unseen and rare poses. Relevant files and tools are available at the project website.

[19] arXiv:2006.07694 [pdf]

Sensorless Freehand 3D Ultrasound Reconstruction via Deep Contextual Learning

Hengtao Guo, Sheng Xu, Bradford Wood, Pingkun Yan

Transrectal ultrasound (US) is the most commonly used imaging modality to guide prostate biopsy and its 3D volume provides even richer context information. Current methods for 3D volume reconstruction from freehand US scans require external tracking devices to provide spatial position for every frame. In this paper, we propose a deep contextual learning network (DCL-Net), which can efficiently exploit the image feature relationship between US frames and reconstruct 3D US volumes without any tracking device. The proposed DCL-Net utilizes 3D convolutions over a US video segment for feature extraction. An embedded self-attention module makes the network focus on the speckle-rich areas for better spatial movement prediction. We also propose a novel case-wise correlation loss to stabilize the training process for improved accuracy. Highly promising results have been obtained by using the developed method. The experiments with ablation studies demonstrate superior performance of the proposed method by comparing against other state-of-the-art methods. Source code of this work is publicly available at this https URL.

[20] arXiv:2006.07496 [pdf]

Optimal Time-Domain Sinusoidal Pulse Width Modulation Technique

Siddharth Tyagi, Isaak Mayergoyz

An optimal time-domain pulse width modulation technique is presented for single-phase and three-phase inverters under the constraint of sinusoidally modulated voltage pulse widths. Harmonic content in currents and voltages is expressed as function of displacement factors, which characterize the placement of the sinusoidally modulated voltage pulses in each sampling subinterval. Implications of symmetries on these displacement factors for three-phase inverters are discussed. Minimization of harmonics is stated as an optimization problem, which is then numerically solved to reveal improvements in harmonic performance.

[21] arXiv:2006.07351 [pdf]

PMD-Tolerant 20 krad/s Endless Polarization and Phase Control for BB84-Based QKD with TDM Pilot Signals

Benjamin Koch, Reinhold Noe

TDM-based polarization and differential phase control with 35ps PMD tolerance and 20krad/s tracking speed is demonstrated. 600ns intervals are reserved for QKD and for 0°- and 45°-polarized pilot signals. ECLs are modulated directly, with high extinction. Power budget is 17dB, fiber length is 63km.

[22] arXiv:2006.07327 [pdf]

GNN3DMOT Graph Neural Network for 3D Multi-Object Tracking with Multi-Feature Learning

Xinshuo Weng, Yongxin Wang, Yunze Man, Kris Kitani

3D Multi-object tracking (MOT) is crucial to autonomous systems. Recent work uses a standard tracking-by-detection pipeline, where feature extraction is first performed independently for each object in order to compute an affinity matrix. Then the affinity matrix is passed to the Hungarian algorithm for data association. A key process of this standard pipeline is to learn discriminative features for different objects in order to reduce confusion during data association. In this work, we propose two techniques to improve the discriminative feature learning for MOT (1) instead of obtaining features for each object independently, we propose a novel feature interaction mechanism by introducing the Graph Neural Network. As a result, the feature of one object is informed of the features of other objects so that the object feature can lean towards the object with similar feature (i.e., object probably with a same ID) and deviate from objects with dissimilar features (i.e., object probably with different IDs), leading to a more discriminative feature for each object; (2) instead of obtaining the feature from either 2D or 3D space in prior work, we propose a novel joint feature extractor to learn appearance and motion features from 2D and 3D space simultaneously. As features from different modalities often have complementary information, the joint feature can be more discriminate than feature from each individual modality. To ensure that the joint feature extractor does not heavily rely on one modality, we also propose an ensemble training paradigm. Through extensive evaluation, our proposed method achieves state-of-the-art performance on KITTI and nuScenes 3D MOT benchmarks. Our code will be made available at this https URL

[23] arXiv:2006.07310 [pdf]

Reservoir Computing meets Recurrent Kernels and Structured Transforms

Jonathan Dong, Ruben Ohana, Mushegh Rafayelyan, Florent Krzakala

Reservoir Computing is a class of simple yet efficient Recurrent Neural Networks where internal weights are fixed at random and only a linear output layer is trained. In the large size limit, such random neural networks have a deep connection with kernel methods. Our contributions are threefold a) We rigorously establish the recurrent kernel limit of Reservoir Computing and prove its convergence. b) We test our models on chaotic time series prediction, a classic but challenging benchmark in Reservoir Computing, and show how the Recurrent Kernel is competitive and computationally efficient when the number of data points remains moderate. c) When the number of samples is too large, we leverage the success of structured Random Features for kernel approximation by introducing Structured Reservoir Computing. The two proposed methods, Recurrent Kernel and Structured Reservoir Computing, turn out to be much faster and more memory-efficient than conventional Reservoir Computing.

[24] arXiv:2006.07152 [pdf]

Move-to-Data A new Continual Learning approach with Deep CNNs, Application for image-class recognition

Miltiadis Poursanidis (LaBRI), Jenny Benois-Pineau (LaBRI), Akka Zemmari (LaBRI), Boris Mansenca (LaBRI), Aymar de Rugy (INCIA)

In many real-life tasks of application of supervised learning approaches, all the training data are not available at the same time. The examples are lifelong image classification or recognition of environmental objects during interaction of instrumented persons with their environment, enrichment of an online-database with more images. It is necessary to pre-train the model at a "training recording phase" and then adjust it to the new coming data. This is the task of incremental/continual learning approaches. Amongst different problems to be solved by these approaches such as introduction of new categories in the model, refining existing categories to sub-categories and extending trained classifiers over them, ... we focus on the problem of adjusting pre-trained model with new additional training data for existing categories. We propose a fast continual learning layer at the end of the neuronal network. Obtained results are illustrated on the opensource CIFAR benchmark dataset. The proposed scheme yields similar performances as retraining but with drastically lower computational cost.

[25] arXiv:2006.07137 [pdf]

STONNE A Detailed Architectural Simulator for Flexible Neural Network Accelerators

Francisco Muñoz-Martínez, José L. Abellán, Manuel E. Acacio, Tushar Krishna

The design of specialized architectures for accelerating the inference procedure of Deep Neural Networks (DNNs) is a booming area of research nowadays. First-generation rigid proposals have been rapidly replaced by more advanced flexible accelerator architectures able to efficiently support a variety of layer types and dimensions. As the complexity of the designs grows, it is more and more appealing for researchers to have cycle-accurate simulation tools at their disposal to allow for fast and accurate design-space exploration, and rapid quantification of the efficacy of architectural enhancements during the early stages of a design. To this end, we present STONNE (Simulation TOol of Neural Network Engines), a cycle-accurate, highly-modular and highly-extensible simulation framework that enables end-to-end evaluation of flexible accelerator architectures running complete contemporary DNN models. We use STONNE to model the recently proposed MAERI architecture and show how it can closely approach the performance results of the publicly available BSV-coded MAERI implementation. Then, we conduct a comprehensive evaluation and demonstrate that the folding strategy implemented for MAERI results in very low compute unit utilization (25% on average across 5 DNN models) which in the end translates into poor performance.

[26] arXiv:2006.07115 [pdf]

Simulating Tariff Impact in Electrical Energy Consumption Profiles with Conditional Variational Autoencoders

Margaux Brégère, Ricardo J. Bessa

The implementation of efficient demand response (DR) programs for household electricity consumption would benefit from data-driven methods capable of simulating the impact of different tariffs schemes. This paper proposes a novel method based on conditional variational autoencoders (CVAE) to generate, from an electricity tariff profile combined with exogenous weather and calendar variables, daily consumption profiles of consumers segmented in different clusters. First, a large set of consumers is gathered into clusters according to their consumption behavior and price-responsiveness. The clustering method is based on a causality model that measures the effect of a specific tariff on the consumption level. Then, daily electrical energy consumption profiles are generated for each cluster with CVAE. This non-parametric approach is compared to a semi-parametric data generator based on generalized additive models and that uses prior knowledge of energy consumption. Experiments in a publicly available data set show that, the proposed method presents comparable performance to the semi-parametric one when it comes to generating the average value of the original data. The main contribution from this new method is the capacity to reproduce rebound and side effects in the generated consumption profiles. Indeed, the application of a special electricity tariff over a time window may also affect consumption outside this time window. Another contribution is that the clustering approach segments consumers according to their daily consumption profile and elasticity to tariff changes. These two results combined are very relevant for an ex-ante testing of future DR policies by system operators, retailers and energy regulators.

[27] arXiv:2006.07033 [pdf]

Scattering medium randomly packed pinhole cameras

Honglin Liu, Xin Wang, Puxiang Lai, Zhentao Liu, Jianhong Shi, Shensheng Han

When light travels through scattering media, speckles (spatially random distribution of fluctuated intensities) are formed due to the interference of light travelling along different optical paths, preventing the perception of structure, absolute location and dimension of a target within or on the other side of the medium. Currently, the prevailing techniques such as wavefront shaping, optical phase conjugation, scattering matrix measurement, and speckle autocorrelation imaging can only picture the target structure in the absence of prior information. Here we show that a scattering medium can be conceptualized as an assembly of randomly packed pinhole cameras, and the corresponding speckle pattern is a superposition of randomly shifted pinhole images. This provides a new perspective to bridge target, scattering medium, and speckle pattern, allowing one to localize and profile a target quantitatively from speckle patterns perceived from the other side of the scattering medium, which is impossible with all existing methods. The method also allows us to interpret some phenomena of diffusive light that are otherwise challenging to understand. For example, why the morphological appearance of speckle patterns changes with the target, why information is difficult to be extracted from thick scattering media, and what determines the capability of seeing through scattering media. In summary, the concept, whilst in its infancy, opens a new door to unveiling scattering media and information extraction from scattering media in real time.

[28] arXiv:2006.06943 [pdf]

A Drone-based Networked System and Methods for Combating Coronavirus Disease (COVID-19) Pandemic

Adarsh Kumar, Kriti Sharma, Harvinder Singh, Sagar Gupta Naugriya, Sukhpal Singh Gill, Rajkumar Buyya

Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered coronavirus. It is similar to influenza viruses and raises concerns through alarming levels of spread and severity resulting in an ongoing pandemic world-wide. Within five months (by May 2020), it infected 5.89 million persons world-wide and over 357 thousand have died. Drones or Unmanned Aerial Vehicles (UAVs) are very helpful in handling the COVID-19 pandemic. This work investigates the drone-based systems, COVID-19 pandemic situations, and proposes architecture for handling pandemic situations in different scenarios using real-time and simulation-based case studies. The proposed architecture uses wearable sensors to record the observations in Body Area Networks (BANs) in a push-pull data fetching mechanism. The proposed architecture is found to be useful in remote and highly congested pandemic areas where either the wireless or Internet connectivity is a major issue or chances of COVID-19 spreading are high. It collects and stores the substantial amount of data in a stipulated period and helps to take appropriate action as and when required. In real-time drone-based healthcare system implementation for COVID-19 operations, it is observed that a large area can be covered for sanitization, thermal image collection, patient identification etc. within a short period (2 KMs within 10 minutes approx.) through aerial route. In the simulation, the same statistics are observed with an addition of collision-resistant strategies working successfully for indoor and outdoor healthcare operations.

[29] arXiv:2006.06940 [pdf]

Neural voice cloning with a few low-quality samples

Sunghee Jung, Hoirin Kim

In this paper, we explore the possibility of speech synthesis from low quality found data using only limited number of samples of target speaker. We try to extract only the speaker embedding from found data of target speaker unlike previous works which tries to train the entire text-to-speech system on found data. Also, the two speaker mimicking approaches which are adaptation and speaker-encoder-based are applied on newly released LibriTTS dataset and previously released VCTK corpus to examine the impact of speaker variety on clarity and target-speaker-similarity .

[30] arXiv:2006.06937 [pdf]

Non-parallel voice conversion based on source-to-target direct mapping

Sunghee Jung, Youngjoo Suh, Yeunju Choi, Hoirin Kim

Recent works of utilizing phonetic posteriograms (PPGs) for non-parallel voice conversion have significantly increased the usability of voice conversion since the source and target DBs are no longer required for matching contents. In this approach, the PPGs are used as the linguistic bridge between source and target speaker features. However, this PPG-based non-parallel voice conversion has some limitation that it needs two cascading networks at conversion time, making it less suitable for real-time applications and vulnerable to source speaker intelligibility at conversion stage. To address this limitation, we propose a new non-parallel voice conversion technique that employs a single neural network for direct source-to-target voice parameter mapping. With this single network structure, the proposed approach can reduce both conversion time and number of network parameters, which can be especially important factors in embedded or real-time environments. Additionally, it improves the quality of voice conversion by skipping the phone recognizer at conversion stage. It can effectively prevent possible loss of phonetic information the PPG-based indirect method suffers. Experiments show that our approach reduces number of network parameters and conversion time by 41.9% and 44.5%, respectively, with improved voice similarity over the original PPG-based method.

[31] arXiv:2006.06932 [pdf]

Personalized Demand Response via Shape-Constrained Online Learning

Ana M. Ospina, Andrea Simonetto, Emiliano Dall'Anese

This paper formalizes a demand response task as an optimization problem featuring a known time-varying engineering cost and an unknown (dis)comfort function. Based on this model, this paper develops a feedback-based projected gradient method to solve the demand response problem in an online fashion, where i) feedback from the user is leveraged to learn the (dis)comfort function concurrently with the execution of the algorithm; and, ii) measurements of electrical quantities are used to estimate the gradient of the known engineering cost. To learn the unknown function, a shape-constrained Gaussian Process is leveraged; this approach allows one to obtain an estimated function that is strongly convex and smooth. The performance of the online algorithm is analyzed by using metrics such as the tracking error and the dynamic regret. A numerical example is illustrated to corroborate the technical findings.

[32] arXiv:2006.06728 [pdf]

Deep Reinforcement Learning for Electric Transmission Voltage Control

Brandon L. Thayer, Thomas J. Overbye

Today, human operators primarily perform voltage control of the electric transmission system. As the complexity of the grid increases, so does its operation, suggesting additional automation could be beneficial. A subset of machine learning known as deep reinforcement learning (DRL) has recently shown promise in performing tasks typically performed by humans. This paper applies DRL to the transmission voltage control problem, presents open-source DRL environments for voltage control, proposes a novel modification to the "deep Q network" (DQN) algorithm, and performs experiments at scale with systems up to 500 buses. The promise of applying DRL to voltage control is demonstrated, though more research is needed to enable DRL-based techniques to consistently outperform conventional methods.

[33] arXiv:2006.06660 [pdf]

Strain engineering in single-, bi- and tri-layer MoS2, MoSe2, WS2 and WSe2

Felix Carrascoso, Hao Li, Riccardo Frisenda, Andres Castellanos-Gomez

Strain is a powerful tool to modify the optical properties of semiconducting transition metal dichalcogenides like MoS2, MoSe2, WS2 and WSe2. In this work we provide a thorough description of the technical details to perform uniaxial strain measurements on these two-dimensional semiconductors and we provide a straightforward calibration method to determine the amount of applied strain with high accuracy. We then employ reflectance spectroscopy to analyze the strain tunability of the electronic properties of single-, bi- and tri-layer MoS2, MoSe2, WS2 and WSe2. Finally, we quantify the flake-to-flake variability by analyzing 15 different single-layer MoS2 flakes.

[34] arXiv:2006.06617 [pdf]

Microheater actuators as a versatile platform for strain engineering in 2D materials

Yu Kyoung Ryu, Felix Carrascoso, Rubén López-Nebreda, Nicolás Agraït, Riccardo Frisenda, Andres Castellanos-Gomez

We present microfabricated thermal actuators to engineer the biaxial strain in two-dimensional (2D) materials. These actuators are based on microheater circuits patterned onto the surface of a polymer with a high thermal expansion coefficient. By running current through the microheater one can vary the temperature of the polymer and induce a controlled biaxial expansion of its surface. This controlled biaxial expansion can be transduced to biaxial strain to 2D materials, placed onto the polymer surface, which in turn induces a shift of the optical spectrum. Our thermal strain actuators can reach a maximum biaxial strain of 0.64 % and they can be modulated at frequencies up to 8 Hz. The compact geometry of these actuators results in a negligible spatial drift of 0.03 um/deg, which facilitates their integration in optical spectroscopy measurements. We illustrate the potential of this strain engineering platform to fabricate a strain-actuated optical modulator with single-layer MoS2.

[35] arXiv:2006.06531 [pdf]

A Comparative Study of U-Net Topologies for Background Removal in Histopathology Images

Abtin Riasatian, Maral Rasoolijaberi, Morteza Babaei, H.R. Tizhoosh

During the last decade, the digitization of pathology has gained considerable momentum. Digital pathology offers many advantages including more efficient workflows, easier collaboration as well as a powerful venue for telepathology. At the same time, applying Computer-Aided Diagnosis (CAD) on Whole Slide Images (WSIs) has received substantial attention as a direct result of the digitization. The first step in any image analysis is to extract the tissue. Hence, background removal is an essential prerequisite for efficient and accurate results for many algorithms. In spite of the obvious discrimination for human operators, the identification of tissue regions in WSIs could be challenging for computers, mainly due to the existence of color variations and artifacts. Moreover, some cases such as alveolar tissue types, fatty tissues, and tissues with poor staining are difficult to detect. In this paper, we perform experiments on U-Net architecture with different network backbones (different topologies) to remove the background as well as artifacts from WSIs in order to extract the tissue regions. We compare a wide range of backbone networks including MobileNet, VGG16, EfficientNet-B3, ResNet50, ResNext101 and DenseNet121. We trained and evaluated the network on a manually labeled subset of The Cancer Genome Atlas (TCGA) Dataset. EfficientNet-B3 and MobileNet by almost 99% sensitivity and specificity reached the best results.

[36] arXiv:2006.06277 [pdf]

W-net Simultaneous segmentation of multi-anatomical retinal structures using a multi-task deep neural network

Hongwei Zhao, Chengtao Peng, Lei Liu, Bin Li

Segmentation of multiple anatomical structures is of great importance in medical image analysis. In this study, we proposed a $\mathcal{W}$-net to simultaneously segment both the optic disc (OD) and the exudates in retinal images based on the multi-task learning (MTL) scheme. We introduced a class-balanced loss and a multi-task weighted loss to alleviate the imbalanced problem and to improve the robustness and generalization property of the $\mathcal{W}$-net. We demonstrated the effectiveness of our approach by applying five-fold cross-validation experiments on two public datasets e\_ophtha\_EX and DiaRetDb1. We achieved F1-score of 94.76\% and 95.73\% for OD segmentation, and 92.80\% and 94.14\% for exudates segmentation. To further prove the generalization property of the proposed method, we applied the trained model on the DRIONS-DB dataset for OD segmentation and on the MESSIDOR dataset for exudate segmentation. Our results demonstrated that by choosing the optimal weights of each task, the MTL based $\mathcal{W}$-net outperformed separate models trained individually on each task. Code and pre-trained models will be available at \url{this https URL}.

[37] arXiv:2006.06261 [pdf]

XiaoiceSing A High-Quality and Integrated Singing Voice Synthesis System

Peiling Lu, Jie Wu, Jian Luan, Xu Tan, Li Zhou

This paper presents XiaoiceSing, a high-quality singing voice synthesis system which employs an integrated network for spectrum, F0 and duration modeling. We follow the main architecture of FastSpeech while proposing some singing-specific design 1) Besides phoneme ID and position encoding, features from musical score (e.g.note pitch and length) are also added. 2) To attenuate off-key issues, we add a residual connection in F0 prediction. 3) In addition to the duration loss of each phoneme, the duration of all the phonemes in a musical note is accumulated to calculate the syllable duration loss for rhythm enhancement. Experiment results show that XiaoiceSing outperforms the baseline system of convolutional neural networks by 1.44 MOS on sound quality, 1.18 on pronunciation accuracy and 1.38 on naturalness respectively. In two A/B tests, the proposed F0 and duration modeling methods achieve 97.3% and 84.3% preference rate over baseline respectively, which demonstrates the overwhelming advantages of XiaoiceSing.

[38] arXiv:2006.06209 [pdf]

High-Performance Perovskite Photodetectors Based on CH3NH3PbBr3 Quantum Dot/TiO2 Heterojunction

Rajeev Ray, Nagaraju Nakka, Suman Kalyan Pal

Organo-lead halide perovskite materials have opened up a great opportunity to develop high performance photodetectors because of their superior optoelectronic properties. The main issue with perovskite-only photodetector is severe carrier recombination. Integration of perovskite with high-conductive materials such as graphene or transition metal sulfides certainly improved the photoresponsivity. However, achieving high overall performance remains a challenge. Here, an improved photodetector is constructed by perovskite quantum dots (QDs) and atomic layer deposited (ALD) ultrathin TiO2 films. The designed CH3NH3PbBr3 QD/TiO2 bilayer device displays inclusive performance with on/off ratio of 6.3x102, responsivity of 85 AW-1, and rise/decay time of 0.09/0.11 s. Furthermore, we demonstrate that interface plays a crucial role in determining the device current and enhance the overall performance of heterostructure photodetectors through interface engineering. We believe that this work can provide a strategy to accelerate development of high-performance solution-processed perovskite photodetectors.

[39] arXiv:2006.06186 [pdf]

Investigating Robustness of Adversarial Samples Detection for Automatic Speaker Verification

Xu Li, Na Li, Jinghua Zhong, Xixin Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng

Recently adversarial attacks on automatic speaker verification (ASV) systems attracted widespread attention as they pose severe threats to ASV systems. However, methods to defend against such attacks are limited. Existing approaches mainly focus on retraining ASV systems with adversarial data augmentation. Also, countermeasure robustness against different attack settings are insufficiently investigated. Orthogonal to prior approaches, this work proposes to defend ASV systems against adversarial attacks with a separate detection network, rather than augmenting adversarial data into ASV training. A VGG-like binary classification detector is introduced and demonstrated to be effective on detecting adversarial samples. To investigate detector robustness in a realistic defense scenario where unseen attack settings exist, we analyze various attack settings and observe that the detector is robust (6.27\% EER_{det} degradation in the worst case) against unseen substitute ASV systems, but it has weak robustness (50.37\% EER_{det} degradation in the worst case) against unseen perturbation methods. The weak robustness against unseen perturbation methods shows a direction for developing stronger countermeasures.

[40] arXiv:2006.06134 [pdf]

Kalman Filter Based Multiple Person Head Tracking

Mohib Ullah, Maqsood Mahmud, Habib Ullah, Kashif Ahmad, Ali Shariq Imran, Faouzi Alaya Cheikh

For multi-target tracking, target representation plays a crucial rule in performance. State-of-the-art approaches rely on the deep learning-based visual representation that gives an optimal performance at the cost of high computational complexity. In this paper, we come up with a simple yet effective target representation for human tracking. Our inspiration comes from the fact that the human body goes through severe deformation and inter/intra occlusion over the passage of time. So, instead of tracking the whole body part, a relative rigid organ tracking is selected for tracking the human over an extended period of time. Hence, we followed the tracking-by-detection paradigm and generated the target hypothesis of only the spatial locations of heads in every frame. After the localization of head location, a Kalman filter with a constant velocity motion model is instantiated for each target that follows the temporal evolution of the targets in the scene. For associating the targets in the consecutive frames, combinatorial optimization is used that associates the corresponding targets in a greedy fashion. Qualitative results are evaluated on four challenging video surveillance dataset and promising results has been achieved.

[41] arXiv:2006.06077 [pdf]

S-semantics -- an example

Włodzimierz Drabent

The s-semantics makes it possible to explicitly deal with variables in program answers. So it seems suitable for programs using nonground data structures, like open lists. However it is difficult to find examples of using the s-semantics to reason about particular programs. Here we apply s-semantics to prove correctness and completeness of Frühwirth's $n$ queens program. This is compared with a proof, published elsewhere, based on the standard semantics and Herbrand interpretations.

[42] arXiv:2006.05974 [pdf]

Provably robust verification of dissipativity properties from data

Anne Koch, Julian Berberich, Frank Allgöwer

Dissipativity properties have proven to be very valuable for systems analysis and controller design. With the rising amount of available data, there has therefore been an increasing interest in determining dissipativity properties from (measured) trajectories directly, while an explicit model of the system remains undisclosed. Most existing approaches for data-driven dissipativity, however, guarantee the dissipativity condition only over a finite time horizon and provide weak or no guarantees on robustness in the presence of noise. In this paper, we present a framework for verifying dissipativity properties from measured data with desirable guarantees. We first consider the case of input-state measurements, where we provide non-conservative and computationally attractive conditions even in the presence of noise. We then extend this approach to input-output data, where similar results hold in the noise-free case.

[43] arXiv:2006.05961 [pdf]

Model-Free Algorithm and Regret Analysis for MDPs with Long-Term Constraints

Qinbo Bai, Vaneet Aggarwal, Ather Gattami

In the optimization of dynamical systems, the variables typically have constraints. Such problems can be modeled as a constrained Markov Decision Process (CMDP). This paper considers a model-free approach to the problem, where the transition probabilities are not known. In the presence of long-term (or average) constraints, the agent has to choose a policy that maximizes the long-term average reward as well as satisfy the average constraints in each episode. The key challenge with the long-term constraints is that the optimal policy is not deterministic in general, and thus standard Q-learning approaches cannot be directly used. This paper uses concepts from constrained optimization and Q-learning to propose an algorithm for CMDP with long-term constraints. For any $\gamma\in(0,\frac{1}{2})$, the proposed algorithm is shown to achieve $O(T^{1/2+\gamma})$ regret bound for the obtained reward and $O(T^{1-\gamma/2})$ regret bound for the constraint violation, where $T$ is the total number of steps. We note that these are the first results on regret analysis for MDP with long-term constraints, where the transition probabilities are not known apriori.

[44] arXiv:2006.05952 [pdf]

Strongly parity-mixed superconductivity in Rashba-Hubbard model

Kosuke Nogaki, Youichi Yanase

Heterostructures containing strongly correlated electron systems provide a platform to clarify interplay of electron correlation and Rashba spin-orbit coupling in unconventional superconductors. Motivated by recent fabrication of artificially-engineered heavy fermion superlattices and high-temperature cuprate superconductors, we conduct a thorough study on superconductivity in Rashba-Hubbard model. In contrast to previous weak coupling approaches, we employ fluctuation-exchange approximation to describe quantum critical magnetic fluctuations and resulting superconductivity. As a result, robust Fermi surfaces against magnetic fluctuations, incommensurate spin fluctuations, and a strongly parity-mixed superconducting phase are demonstrated in a wide range of electron filling from type-II van Hove singularity to half-filling. We also clarify impacts of type-II van Hove singularity on magnetic fluctuations and superconductivity. Whereas the $d_{x^2-y^2}$-wave pairing always dominant, subdominant spin-triplet pairing with either $p$-wave or $f$-wave symmetry shows a comparable magnitude, especially near the type-II van Hove singularity. Our results resolve unsettled issues on strongly correlated Rashba systems and uncover candidate systems of nonreciprocal transport and topological superconductivity.

[45] arXiv:2006.05932 [pdf]

A nonlinear model of vortex-induced forces on an oscillating cylinder in a fluid flow

Jan Decuyper, Tim De Troyer, Koen Tiels, Johan Schoukens, Mark C. Runacres

A nonlinear model relating the imposed motion of a circular cylinder, submerged in a fluid flow, to the transverse force coefficient is presented. The nonlinear fluid system, featuring vortex shedding patterns, limit cycle oscillations and synchronisation, is studied both for swept sine and multisine excitation. A nonparametric nonlinear distortion analysis (FAST) is used to distinguish odd from even nonlinear behaviour. The information which is obtained from the nonlinear analysis is explicitly used in constructing a nonlinear model of the polynomial nonlinear state-space (PNLSS) type. The latter results in a reduction of the number of parameters and an increased accuracy compared to the generic modelling approach where typically no such information of the nonlinearity is used. The obtained model is able to accurately simulate time series of the transverse force coefficient over a wide range of the frequency-amplitude plane of imposed cylinder motion.

[46] arXiv:2006.05923 [pdf]

Cross-Sensor Adversarial Domain Adaptation of Landsat-8 and Proba-V images for Cloud Detection

Gonzalo Mateo-García, Valero Laparra, Dan López-Puigdollers, Luis Gómez-Chova

The number of Earth observation satellites carrying optical sensors with similar characteristics is constantly growing. Despite their similarities and the potential synergies among them, derived satellite products are often developed for each sensor independently. Differences in retrieved radiances lead to significant drops in accuracy, which hampers knowledge and information sharing across sensors. This is particularly harmful for machine learning algorithms, since gathering new ground truth data to train models for each sensor is costly and requires experienced manpower. In this work, we propose a domain adaptation transformation to reduce the statistical differences between images of two satellite sensors in order to boost the performance of transfer learning models. The proposed methodology is based on the Cycle Consistent Generative Adversarial Domain Adaptation (CyCADA) framework that trains the transformation model in an unpaired manner. In particular, Landsat-8 and Proba-V satellites, which present different but compatible spatio-spectral characteristics, are used to illustrate the method. The obtained transformation significantly reduces differences between the image datasets while preserving the spatial and spectral information of adapted images, which is hence useful for any general purpose cross-sensor application. In addition, the training of the proposed adversarial domain adaptation model can be modified to improve the performance in a specific remote sensing application, such as cloud detection, by including a dedicated term in the cost function. Results show that, when the proposed transformation is applied, cloud detection models trained in Landsat-8 data increase cloud detection accuracy in Proba-V.

[47] arXiv:2006.05902 [pdf]

Q-greedyUCB a New Exploration Policy for Adaptive and Resource-efficient Scheduling

Yu Zhao, Joohyun Lee, Wei Chen

This paper proposes a learning algorithm to find a scheduling policy that achieves an optimal delay-power trade-off in communication systems. Reinforcement learning (RL) is used to minimize the expected latency for a given energy constraint where the environments such as traffic arrival rates or channel conditions can change over time. For this purpose, this problem is formulated as an infinite-horizon Markov Decision Process (MDP) with constraints. To handle the constrained optimization problem, we adopt the Lagrangian relaxation technique to solve it. Then, we propose a variant of Q-learning, Q-greedyUCB that combines Q-learning for \emph{average} reward algorithm and Upper Confidence Bound (UCB) policy to solve this decision-making problem. We prove that the Q-greedyUCB algorithm is convergent through mathematical analysis. Simulation results show that Q-greedyUCB finds an optimal scheduling strategy, and is more efficient than Q-learning with the $\varepsilon$-greedy and Average-payoff RL algorithm in terms of the cumulative reward (i.e., the weighted sum of delay and energy) and the convergence speed. We also show that our algorithm can reduce the regret by up to 12% compared to the Q-learning with the $\varepsilon$-greedy and Average-payoff RL algorithm.

[48] arXiv:2006.05852 [pdf]

Intelligent User Clustering and Robust Beamforming Design for UAV-NOMA Downlink

Yanqing Xu, Fang Fang, Donghong Cai, Yi Yuan

In this work, we consider a downlink NOMA network with multiple single-antenna users and multi-antenna UAVs. In particular, the users are spatially located in several clusters by following the Poisson Cluster Process and each cluster is served by a hovering UAV with NOMA. For practical considerations, we assume that only imperfect CSI of each user is available at the UAVs. Based on this model, the problem of joint user clustering and robust beamforming design is formulated to minimize the sum transmission power, and meanwhile, guarantee the QoS requirements of users. Due to the integer variables of user clustering, coupling effects of beamformers, and infinitely many constraints caused by the imperfect CSI, the formulated problem is challenging to solve. For computational complexity reduction, the original problem is divided into user clustering subproblem and robust beamforming design subproblem. By utilizing the users' position information, we propose a k-means++ based unsupervised clustering algorithm to first deal with the user clustering problem. Then, we focus on the robust beamforming design problem. To attain insights on solving the robust beamforming design problem, we firstly investigate the problem with perfect CSI, and the associated problem is shown can be solved optimally. Secondly, for the problem in the general case with imperfect CSI, an SDR based method is proposed to produce a suboptimal solution efficiently. Moreover, we provide a sufficient condition under which the SDR based approach can guarantee to obtain an optimal rank-one solution, which is theoretically analyzed. Finally, an alternating direction method of multipliers based algorithm is proposed to allow the UAVs to perform robust beamforming design in a decentralized fashion efficiently. Simulation results demonstrate the efficacy of the proposed algorithms and transmission scheme.

[49] arXiv:2006.05841 [pdf]

X-ray Monochromatic Imaging from Single-spectrum CT via Machine

Wenxiang Cong, Ge Wang

The conventional computed tomography (CT) with single energy spectrum only reconstructs effective linear attenuation coefficients, obtaining average spectral CT images, basically discarding x-ray energy-dependent information, which cannot be applied for the material identification because different materials may have the same CT value. Dual-energy CT (DECT) is a well-established technique, allowing monochromatic imaging and material decomposition. However, DECT requires two distinct x-ray energy spectra to generate two spectrally-different projection datasets. Generally it would increase radiation dose, system complexity, and equipment cost relative to single-spectrum CT. In this paper, a machine-learning-based CT reconstruction method is proposed to perform monochromatic image reconstruction using clinical CT scanner. This method establishes a residual neural network (ResNet) model to map average spectral CT images to monochromatic images at pre-specified energy level via deep learning. This ResNet is trained based on clinical dual energy dataset, showing an excellent convergence and stability. Testing data demonstrate the trained ResNet produces high quality monochromatic images with a relative error of less than 0.2%. The resultant x-ray monochromatic imaging can be applied for material differentiation, tissue characterization, and proton therapy treatment planning.

[50] arXiv:2006.05837 [pdf]

Narrowband vacuum ultraviolet light via cooperative Raman scattering in dual-pumped gas-filled photonic crystal fiber

Rinat Tyumenev (1), Philip St.J. Russell (1 and 2), David Novoa (1) ((1) Max Planck Institute for the Science of Light, (2) Department of Physics, Friedrich-Alexander-University)

Many fields such as bio-spectroscopy and photochemistry often require sources of vacuum ultraviolet (VUV) pulses featuring a narrow linewidth and tunable over a wide frequency range. However, the majority of available VUV light sources do not simultaneously fulfill those two requirements, and few if any are truly compact, cost-effective and easy to use by non-specialists. Here we introduce a novel approach that goes a long way to meeting this challenge. It is based on hydrogen-filled hollow-core photonic crystal fiber pumped simultaneously by two spectrally distant pulses. Stimulated Raman scattering enables the generation of coherence waves of collective molecular motion in the gas, which together with careful dispersion engineering and control over the modal content of the pump light, facilitates cooperation between the two separate Raman combs, resulting in a spectrum that reaches deep into the VUV. Using this system, we demonstrate the generation of a dual Raman comb of narrowband lines extending down to 141 nm using only 100 mW of input power delivered by a commercial solid-state laser. The approach may enable access to tunable VUV light to any laboratory and therefore boost progress in many research areas across multiple disciplines.

[51] arXiv:2006.05767 [pdf]

Electronic properties of type-II GaAs$_{1-x}$Sb$_{x}$/GaAs quantum rings for applications in intermediate-band solar cells

Reza Arkani, Christopher A. Broderick, Eoin P. O'Reilly

We present a theoretical analysis of the electronic properties of type-II GaAs$_{1-x}$Sb$_{x}$/GaAs quantum rings (QRs), from the perspective of applications in intermediate band solar cells (IBSCs). We outline the analytical solution of Schrödinger's equation for a cylindrical QR of infinite potential depth, and describe the evolution of the QR ground state with QR morphology. Having used this analytical model to elucidate general aspects of the electronic properties of QRs, we undertake multi-band $\textbf{k} \cdot \textbf{p}$ calculations -- including strain and piezoelectric effects -- for realistic GaAs$_{1-x}$Sb$_{x}$/GaAs QRs. Our $\textbf{k} \cdot \textbf{p}$ calculations confirm that the large type-II band offsets in GaAs$_{1-x}$Sb$_{x}$/GaAs QRs provide strong confinement of holes, and further indicate the presence of resonant (quasi-bound) electron states which localise in the centre of the QR. From the perspective of IBSC design the calculated electronic properties demonstrate several benefits, including (i) large hole ionisation energies, mitigating thermionic emission from the intermediate band, and (ii) electron-hole spatial overlaps exceeding those in conventional GaAs$_{1-x}$Sb$_{x}$/GaAs QDs, with the potential to engineer these overlaps via the QR morphology so as to manage the trade-off between optical absorption and radiative recombination. Overall, our analysis highlights the flexibility offered by the QR geometry from the perspective of band structure engineering, and identifies specific combinations of QR alloy composition and morphology which offer optimised sub-band gap energies for QR-based IBSCs.

[52] arXiv:2006.05757 [pdf]

Data science on industrial data -- Today's challenges in brown field applications

Tilman Klaeger, Sebastian Gottschall, Lukas Oehm

Much research is done on data analytics and machine learning. In industrial processes large amounts of data are available and many researchers are trying to work with this data. In practical approaches one finds many pitfalls restraining the application of modern technologies especially in brown field applications. With this paper we want to show state of the art and what to expect when working with stock machines in the field. A major focus in this paper is on data collection which can be more cumbersome than most people might expect. Also data quality for machine learning applications is a challenge once leaving the laboratory. In this area one has to expect the lack of semantic description of the data as well as very little ground truth being available for training and verification of machine learning models. A last challenge is IT security and passing data through firewalls.

[53] arXiv:2006.05754 [pdf]

Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus

Luisa Bentivogli, Beatrice Savoldi, Matteo Negri, Mattia Antonino Di Gangi, Roldano Cattoni, Marco Turchi

Translating from languages without productive grammatical gender like English into gender-marked languages is a well-known difficulty for machines. This difficulty is also due to the fact that the training data on which models are built typically reflect the asymmetries of natural languages, gender bias included. Exclusively fed with textual data, machine translation is intrinsically constrained by the fact that the input sentence does not always contain clues about the gender identity of the referred human entities. But what happens with speech translation, where the input is an audio signal? Can audio provide additional information to reduce gender bias? We present the first thorough investigation of gender bias in speech translation, contributing with i) the release of a benchmark useful for future studies, and ii) the comparison of different technologies (cascade and end-to-end) on two language directions (English-Italian/French).

[54] arXiv:2006.05753 [pdf]

On the Influence of Noise in Randomized Consensus Algorithms

Renato Vizuete, Paolo Frasca, Elena Panteley

In this paper we study the influence of additive noise in randomized consensus algorithms. Assuming that the update matrices are symmetric, we derive a closed form expression for the mean square error induced by the noise, together with upper and lower bounds that are simpler to evaluate. Motivated by the study of Open Multi-Agent Systems, we concentrate on Randomly Induced Discretized Laplacians, a family of update matrices that are generated by sampling subgraphs of a large undirected graph. For these matrices, we express the bounds by using the eigenvalues of the Laplacian matrix of the underlying graph or the graph's average effective resistance, thereby proving their tightness. Finally, we derive expressions for the bounds on some examples of graphs and numerically evaluate them.

[55] arXiv:2006.05747 [pdf]

Uniphore's submission to Fearless Steps Challenge Phase-2

Karthik Pandia D S, Cosimo Spera

We propose supervised systems for speech activity detection (SAD) and speaker identification (SID) tasks in Fearless Steps Challenge Phase-2. The proposed systems for both the tasks share a common convolutional neural network (CNN) architecture. Mel spectrogram is used as features. For speech activity detection, the spectrogram is divided into smaller overlapping chunks. The network is trained to recognize the chunks. The network architecture and the training steps used for the SID task are similar to that of the SAD task, except that longer spectrogram chunks are used. We propose a two-level identification method for SID task. First, for each chunk, a set of speakers is hypothesized based on the neural network posterior probabilities. Finally, the speaker identity of the utterance is identified using the chunk-level hypotheses by applying a voting rule. On SAD task, a detection cost function score of 5.96%, and 5.33% are obtained on dev and eval sets, respectively. A top 5 retrieval accuracy of 82.07% and 82.42% are obtained on the dev and eval sets for SID task. A brief analysis is made on the results to provide insights into the miss-classified cases in both the tasks.

[56] arXiv:2006.05712 [pdf]

Listen to What You Want Neural Network-based Universal Sound Selector

Tsubasa Ochiai, Marc Delcroix, Yuma Koizumi, Hiroaki Ito, Keisuke Kinoshita, Shoko Araki

Being able to control the acoustic events (AEs) to which we want to listen would allow the development of more controllable hearable devices. This paper addresses the AE sound selection (or removal) problems, that we define as the extraction (or suppression) of all the sounds that belong to one or multiple desired AE classes. Although this problem could be addressed with a combination of source separation followed by AE classification, this is a sub-optimal way of solving the problem. Moreover, source separation usually requires knowing the maximum number of sources, which may not be practical when dealing with AEs. In this paper, we propose instead a universal sound selection neural network that enables to directly select AE sounds from a mixture given user-specified target AE classes. The proposed framework can be explicitly optimized to simultaneously select sounds from multiple desired AE classes, independently of the number of sources in the mixture. We experimentally show that the proposed method achieves promising AE sound selection performance and could be generalized to mixtures with a number of sources that are unseen during training.

[57] arXiv:2006.05708 [pdf]

Image reconstruction through a multimode fiber with a simple neural network architecture

Changyan Zhu, Eng Aik Chan, You Wang, Weina Peng, Ruixiang Guo, Baile Zhang, Cesare Soci, Yidong Chong

Multimode fibers (MMFs) have the potential to carry complex images for endoscopy and related applications, but decoding the complex speckle patterns produced by mode-mixing and modal dispersion in MMFs is a serious open problem. Several groups have recently shown that convolutional neural networks (CNNs) can be trained to perform high-fidelity MMF image reconstruction. We find that a considerably simpler neural network architecture, the single hidden layer dense neural network, outperforms previously-used CNNs in terms of image reconstruction fidelity and training time. The performance of the trained neural network persists for hours after the cessation of the training set.

[58] arXiv:2006.05694 [pdf]

HiFi-GAN High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks

Jiaqi Su, Zeyu Jin, Adam Finkelstein

Real-world audio recordings are often degraded by factors such as noise, reverberation, and equalization distortion. This paper introduces HiFi-GAN, a deep learning method to transform recorded speech to sound as though it had been recorded in a studio. We use an end-to-end feed-forward WaveNet architecture, trained with multi-scale adversarial discriminators in both the time domain and the time-frequency domain. It relies on the deep feature matching losses of the discriminators to improve the perceptual quality of enhanced speech. The proposed model generalizes well to new speakers, new speech content, and new environments. It significantly outperforms state-of-the-art baseline methods in both objective and subjective experiments.

[59] arXiv:2006.05678 [pdf]

A framework for modeling interdependencies among households, businesses, and infrastructure systems; and their response to disruptions

Mateusz Iwo Dubaniowski, Hans R. Heinimann

Urban systems, composed of households, businesses, and infrastructures, are continuously evolving and expanding. This has several implications because the impacts of disruptions, and the complexity and interdependence of systems, are rapidly increasing. Hence, we face a challenge in how to improve our understanding about the interdependencies among those entities, as well as their responses to disruptions. The aims of this study were to (1) create an agent that mimics the metabolism of a business or household that obtains supplies from and provides output to infrastructure systems; (2) implement a network of agents that exchange resources, as coordinated with a price mechanism; and (3) test the responses of this prototype model to disruptions. Our investigation resulted in the development of a business/household agent and a dynamically self-organizing mechanism of network coordination under disruption based on costs for production and transportation. Simulation experiments confirmed the feasibility of this new model for analyzing responses to disruptions. Among the nine disruption scenarios considered, in line with our expectations, the one combining the failures of infrastructure links and production processes had the most negative impact. We also identified areas for future research that focus on network topologies, mechanisms for resource allocation, and disruption generation.

[60] arXiv:2006.05674 [pdf]

3D geometric moment invariants from the point of view of the classical invariant theory

Leonid Bedratyuk

The aim of this paper is to clear up the problem of the connection between the 3D geometric moments invariants and the invariant theory, considering a problem of describing of the 3D geometric moments invariants as a problem of the classical invariant theory. Using the remarkable fact that the groups $SO(3)$ and $SL(2)$ are locally isomorphic, we reduced the problem of deriving 3D geometric moments invariants to the well-known problem of the classical invariant theory. We give a precise statement of the 3D geometric invariant moments computation, introducing the notions of the algebras of simultaneous 3D geometric moment invariants, and prove that they are isomorphic to the algebras of joint $SL(2)$-invariants of several binary forms. To simplify the calculating of the invariants we proceed from an action of Lie group $SO(3)$ to an action of its Lie algebra $\mathfrak{sl}_2$. The author hopes that the results will be useful to the researchers in the fields of image analysis and pattern recognition.

[61] arXiv:2006.05669 [pdf]

Interpretable Multimodal Learning for Intelligent Regulation in Online Payment Systems

Shuoyao Wang, Diwei Zhu

With the explosive growth of transaction activities in online payment systems, effective and realtime regulation becomes a critical problem for payment service providers. Thanks to the rapid development of artificial intelligence (AI), AI-enable regulation emerges as a promising solution. One main challenge of the AI-enabled regulation is how to utilize multimedia information, i.e., multimodal signals, in Financial Technology (FinTech). Inspired by the attention mechanism in nature language processing, we propose a novel cross-modal and intra-modal attention network (CIAN) to investigate the relation between the text and transaction. More specifically, we integrate the text and transaction information to enhance the text-trade jointembedding learning, which clusters positive pairs and push negative pairs away from each other. Another challenge of intelligent regulation is the interpretability of complicated machine learning models. To sustain the requirements of financial regulation, we design a CIAN-Explainer to interpret how the attention mechanism interacts the original features, which is formulated as a low-rank matrix approximation problem. With the real datasets from the largest online payment system, WeChat Pay of Tencent, we conduct experiments to validate the practical application value of CIAN, where our method outperforms the state-of-the-art methods.

[62] arXiv:2006.05659 [pdf]

Optimal Participation of Price-maker Battery Energy Storage Systems in Energy and Ancillary Services Markets Considering Degradation Cost

Reza Khalilisenobari, Meng Wu

This paper proposes a bi-level optimization framework to coordinate the operation of price-maker battery energy storage systems (BESSs) in real-time energy, reserve, and pay as performance regulation markets. The framework models both BESS's bidding strategies and the system operator's market clearing process, which makes it possible to assess the optimal allocation of BESS's services across various markets and study the impact of BESS's operation on energy and ancillary services markets. The BESS's strategic biding model is equipped with an accurate degradation cost function for the batteries. Based upon a comprehensive model for the frequency regulation market, an automatic generation control (AGC) signal dispatch model is proposed to deploy AGC signals in the bi-level framework. This enables detailed studies for BESS's operating patterns in the frequency regulation market. Case studies using a synthetic system built upon real-world data are performed to evaluate the proposed framework and study the interactions between BESS's profit maximization activities and wholesale market operations. Sensitivity studies are performed to investigate the impact of BESS's capacity and replacement cost on its revenue from energy and ancillary services markets.

[63] arXiv:2006.05647 [pdf]

Stochastic Gradient Descent for Semilinear Elliptic Equations with Uncertainties

Ting Wang, Jaroslaw Knap

Randomness is ubiquitous in modern engineering. The uncertainty is often modeled as random coefficients in the differential equations that describe the underlying physics. In this work, we describe a two-step framework for numerically solving semilinear elliptic partial differential equations with random coefficients 1) reformulate the problem as a functional minimization problem based on the direct method of calculus of variation; 2) solve the minimization problem using the stochastic gradient descent method. We provide the convergence criterion for the resulted stochastic gradient descent algorithm and discuss some useful technique to overcome the issues of ill-conditioning and large variance. The accuracy and efficiency of the algorithm are demonstrated by numerical experiments.

[64] arXiv:2006.05627 [pdf]

A survey on deep hashing for image retrieval

Xiaopeng Zhang

Hashing has been widely used in approximate nearest search for large-scale database retrieval for its computation and storage efficiency. Deep hashing, which devises convolutional neural network architecture to exploit and extract the semantic information or feature of images, has received increasing attention recently. In this survey, several deep supervised hashing methods for image retrieval are evaluated and I conclude three main different directions for deep supervised hashing methods. Several comments are made at the end. Moreover, to break through the bottleneck of the existing hashing methods, I propose a Shadow Recurrent Hashing(SRH) method as a try. Specifically, I devise a CNN architecture to extract the semantic features of images and design a loss function to encourage similar images projected close. To this end, I propose a concept shadow of the CNN output. During optimization process, the CNN output and its shadow are guiding each other so as to achieve the optimal solution as much as possible. Several experiments on dataset CIFAR-10 show the satisfying performance of SRH.

[65] arXiv:2006.05613 [pdf]

Agent Programming for Industrial Applications Some Advantages and Drawbacks

Otávio Arruda Matoso, Luis P. A. Lampert, Jomi Fred Hübner, Mateus Conceição, Sérgio P. Bernardes, Cleber Jorge Amaral, Maicon R. Zatelli, Marcelo L. de Lima

Autonomous agents are seen as a prominent technology to be applied in industrial scenarios. Classical automation solutions are struggling with challenges related to high dynamism, prompt actuation, heterogeneous entities, including humans, and decentralised decision-making. Besides promoting concepts, languages, and tools to face such challenges, agents must also provide high reliability. To assess how appropriate and mature are agents for industrial applications, we have investigated its application in two scenarios of the gas and oil industry. This paper presents the development of systems and the initial results highlighting the advantages and drawbacks of the agents approach when compared with the existing automation solutions.

[66] arXiv:2006.05598 [pdf]

Max-Min Optimal Beamforming for Cell-Free Massive MIMO

Andong Zhou, Jingxian Wu, Erik G. Larsson, Pingzhi Fan

This letter develops an optimum beamforming method for downlink transmissions in cell-free massive multiple-input multiple-output (MIMO) systems, which employ a massive number of distributed access points to provide concurrent services to multiple users. The optimum design is formulated as a max-min problem that maximizes the minimum signal-to-interference-plus-noise ratio of all users. It is shown analytically that the problem is quasi-concave, and the optimum solution is obtained with the second-order cone programming. The proposed method identifies the best achievable beamforming performance in cell-free massive MIMO systems. The results can be used as benchmarks for the design of practical low complexity beamformers.

[67] arXiv:2006.05583 [pdf]

Variational Optimization for the Submodular Maximum Coverage Problem

Jian Du, Zhigang Hua, Shuang Yang

We examine the \emph{submodular maximum coverage problem} (SMCP), which is related to a wide range of applications. We provide the first variational approximation for this problem based on the Nemhauser divergence, and show that it can be solved efficiently using variational optimization. The algorithm alternates between two steps (1) an E step that estimates a variational parameter to maximize a parameterized \emph{modular} lower bound; and (2) an M step that updates the solution by solving the local approximate problem. We provide theoretical analysis on the performance of the proposed approach and its curvature-dependent approximate factor, and empirically evaluate it on a number of public data sets and several application tasks.

[68] arXiv:2006.05579 [pdf]

Deep reinforcement learning for optical systems A case study of mode-locked lasers

Chang Sun, Eurika Kaiser, Steven L. Brunton, J. Nathan Kutz

We demonstrate that deep reinforcement learning (deep RL) provides a highly effective strategy for the control and self-tuning of optical systems. Deep RL integrates the two leading machine learning architectures of deep neural networks and reinforcement learning to produce robust and stable learning for control. Deep RL is ideally suited for optical systems as the tuning and control relies on interactions with its environment with a goal-oriented objective to achieve optimal immediate or delayed rewards. This allows the optical system to recognize bi-stable structures and navigate, via trajectory planning, to optimally performing solutions, the first such algorithm demonstrated to do so in optical systems. We specifically demonstrate the deep RL architecture on a mode-locked laser, where robust self-tuning and control can be established through access of the deep RL agent to its waveplates and polarizers. We further integrate transfer learning to help the deep RL agent rapidly learn new parameter regimes and generalize its control authority. Additionally, the deep RL learning can be easily integrated with other control paradigms to provide a broad framework to control any optical system.

[69] arXiv:2006.05575 [pdf]

Deep Learning-based Aerial Image Segmentation with Open Data for Disaster Impact Assessment

Ananya Gupta, Simon Watson, Hujun Yin

Satellite images are an extremely valuable resource in the aftermath of natural disasters such as hurricanes and tsunamis where they can be used for risk assessment and disaster management. In order to provide timely and actionable information for disaster response, in this paper a framework utilising segmentation neural networks is proposed to identify impacted areas and accessible roads in post-disaster scenarios. The effectiveness of pretraining with ImageNet on the task of aerial image segmentation has been analysed and performances of popular segmentation models compared. Experimental results show that pretraining on ImageNet usually improves the segmentation performance for a number of models. Open data available from OpenStreetMap (OSM) is used for training, forgoing the need for time-consuming manual annotation. The method also makes use of graph theory to update road network data available from OSM and to detect the changes caused by a natural disaster. Extensive experiments on data from the 2018 tsunami that struck Palu, Indonesia show the effectiveness of the proposed framework. ENetSeparable, with 30% fewer parameters compared to ENet, achieved comparable segmentation results to that of the state-of-the-art networks.

[70] arXiv:2006.05567 [pdf]

Wideband Collaborative Spectrum Sensing using Massive MIMO Decision Fusion

I. Dey, D. Ciuonzo, P. Salvo Rossi

In this paper, in order to tackle major challenges of spectrum exploration \& allocation in Cognitive Radio (CR) networks, we apply the general framework of Decision Fusion (DF) to wideband collaborative spectrum sensing based on Orthogonal Frequency Division Multiplexing (OFDM) reporting. At the transmitter side, we employ OFDM without Cyclic Prefix (CP) in order to improve overall bandwidth efficiency of the reporting phase in networks with high user density. On the other hand, at the receiver side (of the reporting channel) we device the Time-Reversal Widely Linear (TR-WL), Time-Reversal Maximal Ratio Combining (TR-MRC) and modified TR-MRC (TR-mMRC) rules for DF. The DF Center (DFC) is assumed to be equipped with a large antenna array, serving a number of unauthorized users competing for the spectrum, thereby resulting in a ``virtual'' massive Multiple-Input Multiple-Output (MIMO) channel. The effectiveness of the proposed TR-based rules in combating ($a$) inter-symbol and ($b$) inter-carrier interference over conventional (non-TR) counterparts is then examined, as a function of the Signal-to-Interference-plus-Noise Ratio (SINR). Closed-form performance, in terms of system false-alarm and detection probabilities, is derived for the formulated fusion rules. Finally, the impact of large-scale channel effects on the proposed fusion rules is also investigated, via Monte-Carlo simulations.

[71] arXiv:2006.05555 [pdf]

On the Trade-offs between Coverage Radius, Altitude and Beamwidth for Practical UAV Deployments

Haneya Naeem Qureshi, Ali Imran

Current studies on Unmanned Aerial Vehicle (UAV) based cellular deployment consider UAVs as aerial base stations for air-to-ground communication. However, they analyze UAV coverage radius and altitude interplay while omitting or over-simplifying an important aspect of UAV deployment, i.e., effect of a realistic antenna pattern. This paper addresses the UAV deployment problem while using a realistic 3D directional antenna model. New trade-offs between UAV design space dimensions are revealed and analyzed in different scenarios. The sensitivity of coverage area to both antenna beamwidth and height is compared. The analysis is extended to multiple UAVs and a new packing scheme is proposed for multiple UAVs coverage that offers several advantages compared to prior approaches.

[72] arXiv:2006.05547 [pdf]

Deep Adversarial Koopman Model for Reaction-Diffusion systems

Kaushik Balakrishnan, Devesh Upadhyay

Reaction-diffusion systems are ubiquitous in nature and in engineering applications, and are often modeled using a non-linear system of governing equations. While robust numerical methods exist to solve them, deep learning-based reduced ordermodels (ROMs) are gaining traction as they use linearized dynamical models to advance the solution in time. One such family of algorithms is based on Koopman theory, and this paper applies this numerical simulation strategy to reaction-diffusion systems. Adversarial and gradient losses are introduced, and are found to robustify the predictions. The proposed model is extended to handle missing training data as well as recasting the problem from a control perspective. The efficacy of these developments are demonstrated for two different reaction-diffusion problems (1) the Kuramoto-Sivashinsky equation of chaos and (2) the Turing instability using the Gray-Scott model.

[73] arXiv:2006.05544 [pdf]

Resolution-Enhanced MRI-Guided Navigation of Spinal Cellular Injection Robot

Daniel Enrique Martinez, Waiman Meinhold, John Oshinski, Ai-Ping Hu, Jun Ueda

This paper presents a method of navigating a surgical robot beyond the resolution of magnetic resonance imaging (MRI) by using a resolution enhancement technique enabled by high-precision piezoelectric actuation. The surgical robot was specifically designed for injecting stem cells into the spinal cord. This particular therapy can be performed in a shorter time by using a MRI-compatible robotic platform than by using a manual needle positioning platform. Imaging resolution of fiducial markers attached to the needle guide tubing was enhanced by reconstructing a high-resolution image from multiple images with sub-pixel movements of the robot. The parallel-plane direct-drive needle positioning mechanism positioned the needle guide with a high spatial precision that is two orders of magnitude higher than typical MRI resolution up to 1 mm. Reconstructed resolution enhanced images were used to navigate the robot precisely that would not have been possible by using standard MRI. Experiments were conducted to verify the effectiveness of the proposed enhanced-resolution image-guided intervention.

[74] arXiv:2006.05542 [pdf]

Guidelines for the Search Strategy to Update Systematic Literature Reviews in Software Engineering

Claes Wohlin, Emilia Mendes, Katia Romero Felizardo, Marcos Kalinowski

Context Systematic Literature Reviews (SLRs) have been adopted within Software Engineering (SE) for more than a decade to provide meaningful summaries of evidence on several topics. Many of these SLRs are now potentially not fully up-to-date, and there are no standard proposals on how to update SLRs in SE. Objective The objective of this paper is to propose guidelines on how to best search for evidence when updating SLRs in SE, and to evaluate these guidelines using an SLR that was not employed during the formulation of the guidelines. Method To propose our guidelines, we compare and discuss outcomes from applying different search strategies to identify primary studies in a published SLR, an SLR update, and two replications in the area of effort estimation. These guidelines are then evaluated using an SLR in the area of software ecosystems, its update and a replication. Results The use of a single iteration forward snowballing with Google Scholar, and employing as a seed set the original SLR and its primary studies is the most cost-effective way to search for new evidence when updating SLRs. Furthermore, the importance of having more than one researcher involved in the selection of papers when applying the inclusion and exclusion criteria is highlighted through the results. Conclusions Our proposed guidelines formulated based upon an effort estimation SLR, its update and two replications, were supported when using an SLR in the area of software ecosystems, its update and a replication. Therefore, we put forward that our guidelines ought to be adopted for updating SLRs in SE.

[75] arXiv:2006.05540 [pdf]

Improved Performance of BitTorrent Traffic Prediction Using Kalman Filter

Harisankar Sadasivan, Pranav Channakeshava, Pathipati Srihari

Supervising internet traffic is essential for any Internet Service Provider (ISP) to dynamically allocate bandwidth in an optimized manner. BitTorrent is a well-known peer-to-peer file-sharing protocol for bulky file transfer. Its extensive bandwidth consumption affects the Quality of Service (QoS) and causes latency to other applications. There is a strong requirement to predict the BitTorrent traffic to improve the QoS. In this paper, we propose a Kalman filter (KF) based method to predict the network traffic for various traffic data sets. The observed performance of KF is superior in terms of both Mean Squared Error (MSE) and total computation time, when compared to Auto Regressive Moving Average (ARMA) model.

[76] arXiv:2006.05522 [pdf]

Compact SQUID realized in a double layer graphene heterostructure

David I. Indolese, Paritosh Karnatak, Artem Kononov, Raphaëlle Delagrange, Roy Haller, Lujun Wang, Péter Makk, Kenji Watanabe, Takashi Taniguchi, Christian Schönenberger

Two-dimensional systems that host one-dimensional helical states are exciting from the perspective of scalable topological quantum computation when coupled with a superconductor. Graphene is particularly promising for its high electronic quality, versatility in van der Waals heterostructures and its electron and hole-like degenerate 0$th$ Landau level. Here, we study a compact double layer graphene SQUID (superconducting quantum interference device), where the superconducting loop is reduced to the superconducting contacts, connecting two parallel graphene Josephson junctions. Despite the small size of the SQUID, it is fully tunable by independent gate control of the Fermi energies in both layers. Furthermore, both Josephson junctions show a skewed current phase relationship, indicating the presence of superconducting modes with high transparency. In the quantum Hall regime we measure a well defined conductance plateau of 2$e^2/h$ an indicative of counter propagating edge channels in the two layers. Our work opens a way for engineering topological superconductivity by coupling helical edge states, from graphene's electron-hole degenerate 0$th$ Landau level via superconducting contacts.

[77] arXiv:2006.05521 [pdf]

Supervised Learning of Sparsity-Promoting Regularizers for Denoising

Michael T. McCann, Saiprasad Ravishankar

We present a method for supervised learning of sparsity-promoting regularizers for image denoising. Sparsity-promoting regularization is a key ingredient in solving modern image reconstruction problems; however, the operators underlying these regularizers are usually either designed by hand or learned from data in an unsupervised way. The recent success of supervised learning (mainly convolutional neural networks) in solving image reconstruction problems suggests that it could be a fruitful approach to designing regularizers. As a first experiment in this direction, we propose to denoise images using a variational formulation with a parametric, sparsity-promoting regularizer, where the parameters of the regularizer are learned to minimize the mean squared error of reconstructions on a training set of (ground truth image, measurement) pairs. Training involves solving a challenging bilievel optimization problem; we derive an expression for the gradient of the training loss using Karush-Kuhn-Tucker conditions and provide an accompanying gradient descent algorithm to minimize it. Our experiments on a simple synthetic, denoising problem show that the proposed method can learn an operator that outperforms well-known regularizers (total variation, DCT-sparsity, and unsupervised dictionary learning) and collaborative filtering. While the approach we present is specific to denoising, we believe that it can be adapted to the whole class of inverse problems with linear measurement models, giving it applicability to a wide range of image reconstruction problems.

[78] arXiv:2006.05513 [pdf]

A Deep Learning-Based Method for Automatic Segmentation of Proximal Femur from Quantitative Computed Tomography Images

Chen Zhao, Joyce H. Keyak, Jinshan Tang, Tadashi S. Kaneko, Sundeep Khosla, Shreyasee Amin, Elizabeth J. Atkinson, Lan-Juan Zhao, Michael J. Serou, Chaoyang Zhang, Hui Shen, Hong-Wen Deng, Weihua Zhou

Purpose Proximal femur image analyses based on quantitative computed tomography (QCT) provide a method to quantify the bone density and evaluate osteoporosis and risk of fracture. We aim to develop a deep-learning-based method for automatic proximal femur segmentation. Methods and Materials We developed a 3D image segmentation method based on V-Net, an end-to-end fully convolutional neural network (CNN), to extract the proximal femur QCT images automatically. The proposed V-net methodology adopts a compound loss function, which includes a Dice loss and a L2 regularizer. We performed experiments to evaluate the effectiveness of the proposed segmentation method. In the experiments, a QCT dataset which included 397 QCT subjects was used. For the QCT image of each subject, the ground truth for the proximal femur was delineated by a well-trained scientist. During the experiments for the entire cohort then for male and female subjects separately, 90% of the subjects were used in 10-fold cross-validation for training and internal validation, and to select the optimal parameters of the proposed models; the rest of the subjects were used to evaluate the performance of models. Results Visual comparison demonstrated high agreement between the model prediction and ground truth contours of the proximal femur portion of the QCT images. In the entire cohort, the proposed model achieved a Dice score of 0.9815, a sensitivity of 0.9852 and a specificity of 0.9992. In addition, an R2 score of 0.9956 (p<0.001) was obtained when comparing the volumes measured by our model prediction with the ground truth. Conclusion This method shows a great promise for clinical application to QCT and QCT-based finite element analysis of the proximal femur for evaluating osteoporosis and hip fracture risk.

[79] arXiv:2006.05509 [pdf]

Can artificial intelligence (AI) be used to accurately detect tuberculosis (TB) from chest x-ray? A multiplatform evaluation of five AI products used for TB screening in a high TB-burden setting

Zhi Zhen Qin, Shahriar Ahmed, Mohammad Shahnewaz Sarker, Kishor Paul, Ahammad Shafiq Sikder Adel, Tasneem Naheyan, Sayera Banu, Jacob Creswell

Powered by artificial intelligence (AI), particularly deep neural networks, computer aided detection (CAD) tools can be trained to recognize TB-related abnormalities on chest radiographs, thereby screening large numbers of people and reducing the pressure on healthcare professionals. Addressing the lack of studies comparing the performance of different products, we evaluated five AI software platforms specific to TB CAD4TB (v6), InferReadDR (v2), Lunit INSIGHT for Chest Radiography (v4.9.0) , JF CXR-1 (v2) by and qXR (v3) by on an unseen dataset of chest X-rays collected in three TB screening center in Dhaka, Bangladesh. The 23,566 individuals included in the study all received a CXR read by a group of three Bangladeshi board-certified radiologists. A sample of CXRs were re-read by US board-certified radiologists. Xpert was used as the reference standard. All five AI platforms significantly outperformed the human readers. The areas under the receiver operating characteristic curves are qXR 0.91 (95% CI0.90-0.91), Lunit INSIGHT CXR 0.89 (95% CI0.88-0.89), InferReadDR 0.85 (95% CI0.84-0.86), JF CXR-1 0.85 (95% CI0.84-0.85), CAD4TB 0.82 (95% CI0.81-0.83). We also proposed a new analytical framework that evaluates a screening and triage test and informs threshold selection through tradeoff between cost efficiency and ability to triage. Further, we assessed the performance of the five AI algorithms across the subgroups of age, use cases, and prior TB history, and found that the threshold scores performed differently across different subgroups. The positive results of our evaluation indicate that these AI products can be useful screening and triage tools for active case finding in high TB-burden regions.

[80] arXiv:2006.05498 [pdf]

On Decidability of Time-bounded Reachability in CTMDPs

Rupak Majumdar, Mahmoud Salamati, Sadegh Soudjani

We consider the time-bounded reachability problem for continuous-time Markov decision processes. We show that the problem is decidable subject to Schanuel's conjecture. Our decision procedure relies on the structure of optimal policies and the conditional decidability (under Schanuel's conjecture) of the theory of reals extended with exponential and trigonometric functions over bounded domains. We further show that any unconditional decidability result would imply unconditional decidability of the bounded continuous Skolem problem, or equivalently, the problem of checking if an exponential polynomial has a non-tangential zero in a bounded interval. We note that the latter problems are also decidable subject to Schanuel's conjecture but finding unconditional decision procedures remain longstanding open problems.

You can also browse papers in other categories.