Density Propagation with Characteristics-based Deep Learning. (arXiv:1911.09311v1 [math.DS])

Uncertainty propagation in nonlinear dynamic systems remains an outstanding problem in scientific computing and control. Numerous approaches have been developed, but are limited in their capability to tackle problems with more than a few uncertain variables or require large amounts of simulation data. In this paper, we propose a data-driven method for approximating joint probability…

Data Proxy Generation for Fast and Efficient Neural Architecture Search. (arXiv:1911.09322v1 [cs.LG])

Owing to recent advances, Neural Architecture Search (NAS) has gained popularity for designing the best networks for specific tasks. Although it shows promising results on many benchmarks and competitions, NAS still suffers from the high computational cost of searching a high-dimensional architectural design space, and this problem becomes even worse when we want to…

Multi-objective Neural Architecture Search via Predictive Network Performance Optimization. (arXiv:1911.09336v1 [cs.LG])

Neural Architecture Search (NAS) has shown great potential for finding neural network designs that outperform human-designed ones. Sample-based NAS is the most fundamental method, aiming to explore the search space and evaluate the most promising architectures. However, few works have focused on improving the sampling efficiency of multi-objective NAS. Inspired by the nature…

Hybrid quantile estimation for asymmetric power GARCH models. (arXiv:1911.09343v1 [econ.EM])

Asymmetric power GARCH models have been widely used to study the higher-order moments of financial returns, while their quantile estimation has rarely been investigated. This paper introduces a simple monotonic transformation of the conditional quantile function to make the quantile regression tractable. The asymptotic normality of the resulting quantile estimators is established under either…

Examining the impact of data quality and completeness of electronic health records on predictions of patients risks of cardiovascular disease. (arXiv:1911.08504v1 [stat.AP])

The objective is to assess the extent of variation in the data quality and completeness of electronic health records, and its impact on the robustness of risk predictions of incident cardiovascular disease (CVD), using a risk prediction tool that is based on routinely collected data (QRISK3). The study design is a longitudinal cohort study with a setting…

Gromov-Wasserstein Factorization Models for Graph Clustering. (arXiv:1911.08530v1 [cs.LG])

We propose a new nonlinear factorization model for graphs with topological structures and, optionally, node attributes. This model is based on a pseudometric called Gromov-Wasserstein (GW) discrepancy, which compares graphs in a relational way. It estimates observed graphs as GW barycenters constructed from a set of atoms with different weights. By minimizing the…
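
To illustrate what comparing graphs "in a relational way" can mean, here is a minimal sketch that evaluates a GW-style objective for a fixed coupling between two graphs. It assumes the commonly used squared-loss form and that each graph is given as a pairwise relation matrix (for example, an adjacency or shortest-path matrix); the abstract does not specify either, so these are assumptions. The function name `gw_objective` is hypothetical. The actual GW discrepancy minimizes this quantity over all valid couplings, which the sketch does not attempt.

```python
import numpy as np

def gw_objective(C1, C2, T):
    """Evaluate sum_{i,j,k,l} (C1[i,k] - C2[j,l])^2 * T[i,j] * T[k,l].

    C1: (n, n) relation matrix of the first graph (assumed input format).
    C2: (m, m) relation matrix of the second graph.
    T:  (n, m) coupling between the nodes of the two graphs.
    This only scores a given coupling; it does not optimize over T.
    """
    C1 = np.asarray(C1, dtype=float)
    C2 = np.asarray(C2, dtype=float)
    T = np.asarray(T, dtype=float)
    # diff[i, j, k, l] = C1[i, k] - C2[j, l], built by broadcasting.
    diff = C1[:, None, :, None] - C2[None, :, None, :]
    return float(np.einsum("ijkl,ij,kl->", diff ** 2, T, T))

# Two copies of the same two-node graph.
C = np.array([[0.0, 1.0], [1.0, 0.0]])
matched = np.eye(2) / 2          # couples each node with its counterpart
uniform = np.full((2, 2), 0.25)  # spreads mass over all node pairs

print(gw_objective(C, C, matched))
print(gw_objective(C, C, uniform))
```

The matched coupling preserves every pairwise relation, so it scores lower than the uniform coupling, which mixes mismatched pairs. Note that the four-index tensor costs O(n²m²) memory, so this form is only suitable for small graphs.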

A Framework for Challenge Design: Insight and Deployment Challenges to Address Medical Image Analysis Problems. (arXiv:1911.08531v1 [stat.AP])

In this paper we aim to refine the concept of grand challenges in medical image analysis, based on statistical principles from quantitative and qualitative experimental research. We identify two types of challenges based on their generalization objective: 1) a deployment challenge and 2) an insight challenge. A deployment challenge’s generalization objective is to find algorithms…

Robust Learning of Discrete Distributions from Batches. (arXiv:1911.08532v1 [cs.LG])

Let $d$ be the lowest $L_1$ distance to which a $k$-symbol distribution $p$ can be estimated from $m$ batches of $n$ samples each, when up to $\beta m$ batches may be adversarial. For $\beta<1/2$, Qiao and Valiant (2017) showed that $d=\Omega(\beta/\sqrt{n})$ and requires $m=\Omega(k/\beta^2)$ batches. For $\beta<1/900$, they provided a $d$ and $m$ order-optimal algorithm…
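
As a reading aid for the error metric above, the following minimal sketch computes the $L_1$ distance between a $k$-symbol distribution and an estimate pooled from batches. The helper names `l1_distance` and `pooled_estimate` are hypothetical, and the pooled estimator is a naive baseline chosen only for illustration: it is not robust to adversarial batches and is not the algorithm the abstract refers to.

```python
from collections import Counter

def l1_distance(p, q):
    """L1 distance between two distributions given as equal-length lists."""
    return sum(abs(a - b) for a, b in zip(p, q))

def pooled_estimate(batches, k):
    """Naive estimate: pool all samples from all batches and normalize counts.

    batches: list of batches, each a list of symbols in range(k).
    Assumption: every batch is trusted, so adversarial batches would bias it.
    """
    counts = Counter(symbol for batch in batches for symbol in batch)
    total = sum(counts.values())
    return [counts[i] / total for i in range(k)]

# m = 2 batches of n = 2 samples each over k = 2 symbols.
batches = [[0, 1], [1, 1]]
estimate = pooled_estimate(batches, k=2)
print(estimate)
print(l1_distance(estimate, [0.5, 0.5]))
```

Because a $\beta$ fraction of batches may be adversarial, a pooled estimate like this can be shifted arbitrarily within that fraction, which is why the robust setting needs a different estimator.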

Heterogeneous Deep Graph Infomax. (arXiv:1911.08538v1 [cs.LG])

Graph representation learning aims to learn universal node representations that preserve both node attributes and structural information. The derived node representations can be used for various downstream tasks, such as node classification and node clustering. When a graph is heterogeneous, the problem becomes more challenging than the homogeneous graph node learning problem. Inspired by…

Prediction Focused Topic Models for Electronic Health Records. (arXiv:1911.08551v1 [cs.LG])

Electronic Health Record (EHR) data can be represented as discrete counts over a high dimensional set of possible procedures, diagnoses, and medications. Supervised topic models present an attractive option for incorporating EHR data as features into a prediction problem: given a patient’s record, we estimate a set of latent factors that are predictive of the…