### 2020

Graph regularization plays an important role in electroencephalography (EEG) source imaging and hyperspectral unmixing, both of which are challenging and ill-posed inverse problems. Specifically, for EEG brain source localization, we develop a graph Laplacian-based model that explicitly takes the label information (happiness, sadness, surprise, etc.) into account. We also incorporate modern regularizations, such as the L1 norm, total variation, and the nuclear norm (low rank), to further improve the accuracy. On the other hand, we consider the use of graph total variation (gTV) regularization for blind hyperspectral unmixing, which aims to identify the pure spectra of individual materials (i.e., endmembers) and their proportions (i.e., abundances) at each pixel. To further alleviate the computational cost, we apply the Nyström method to approximate a fully connected graph by a small subset of sample points and adopt the Merriman-Bence-Osher (MBO) scheme to solve the gTV-involved subproblem by decomposing a grayscale image into a bit-wise form...
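To make the role of graph Laplacian regularization concrete, here is a minimal, self-contained sketch (not the EEG model above; the matrix sizes, chain graph, and regularization weight are illustrative) of solving an underdetermined linear inverse problem b = Ax + noise with a Laplacian penalty x^T L x:

```python
import numpy as np

# Toy sketch: min_x ||A x - b||^2 + lam * x^T L x,
# where L = D - W is the Laplacian of a similarity graph on the unknowns.
rng = np.random.default_rng(0)
n, m = 30, 20                                  # more unknowns than measurements
A = rng.standard_normal((m, n))

# Chain graph: neighboring entries of x are encouraged to be similar.
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
L = np.diag(W.sum(axis=1)) - W

x_true = np.repeat([1.0, -1.0, 2.0], n // 3)   # piecewise-constant signal
b = A @ x_true + 0.05 * rng.standard_normal(m)

lam = 1.0
# Normal equations of the regularized least-squares problem.
x_hat = np.linalg.solve(A.T @ A + lam * L, A.T @ b)
print(np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```

Without the Laplacian term the system A^T A is rank-deficient (m < n); the graph prior is what makes the solve well-posed.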

Abstract: Many data in real-world applications lie in a high-dimensional space but are concentrated on or near a low-dimensional manifold. Our goal is to estimate functions on the manifold from finite samples of data. This talk focuses on an efficient approximation theory of deep ReLU networks for functions supported on low-dimensional manifolds. We construct a ReLU network for such function approximation where the size of the network grows exponentially with respect to the intrinsic dimension of the manifold. When the function is estimated from finite samples, we prove that the mean squared error of the function approximation converges, as the number of training samples increases, at a rate depending on the intrinsic dimension of the manifold instead of the ambient dimension of the space. These results demonstrate that deep neural networks are adaptive to low-dimensional geometric structures of data. This is joint work with Minshuo Chen, Haoming Jiang, and Tuo Zhao (Georgia Institute of Technology).

Abstract: Quantized neural networks are attractive due to their inference efficiency. However, training quantized neural networks involves minimizing a discontinuous and piecewise constant loss function. Such a loss function has zero gradient almost everywhere (a.e.) with respect to the weight variables, which makes conventional back-propagation inapplicable. To this end, we study a class of biased first-order oracles, termed coarse gradients, for overcoming the vanishing gradient issue in both the regression and classification settings. In addition, we propose an efficient computational method featuring a blended coarse gradient step for network quantization, which achieves state-of-the-art accuracies in image classification without extensive tuning of hyper-parameters.
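The coarse-gradient idea can be illustrated with a small, hedged sketch (not the paper's algorithm; the 1-bit quantizer, linear model, and step size are illustrative): the forward pass uses sign-quantized weights, whose true gradient vanishes a.e., while the backward pass replaces the derivative of sign() by the identity:

```python
import numpy as np

# Minimal coarse-gradient (straight-through-style) update, assuming
# 1-bit weights w_q = sign(w) in a linear regression model. The true
# gradient w.r.t. w is zero a.e. because sign() is piecewise constant.
rng = np.random.default_rng(1)
n, d = 200, 5
w_true = np.sign(rng.standard_normal(d))      # target is itself binary
X = rng.standard_normal((n, d))
y = X @ w_true

w = 0.1 * rng.standard_normal(d)              # float "shadow" weights
lr = 0.01
for _ in range(200):
    w_q = np.sign(w)                          # forward pass: quantized weights
    resid = X @ w_q - y
    # Coarse gradient: chain rule with sign'(w) replaced by the identity.
    g = X.T @ resid / n
    w -= lr * g

print(np.mean((X @ np.sign(w) - y) ** 2))     # near zero if signs are recovered
```

The float weights accumulate the coarse-gradient updates until their signs match the target, after which the coarse gradient vanishes as well.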

Abstract:

We introduce a novel idea, the WaveHoltz iteration, for solving the Helmholtz equation, inspired by recent work on exact controllability (EC) methods. As in EC methods, our method makes use of time-domain methods for wave equations to design frequency-domain Helmholtz solvers, but unlike EC methods, we do not require adjoint solves. We show that the WaveHoltz iteration we propose is symmetric and positive definite in the continuous setting. We also present numerical examples, using various discretization techniques, which show that our method can be used to solve problems with rather high wave numbers...
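For intuition, here is a hedged 1D sketch of a WaveHoltz-style iteration. The abstract does not spell out the scheme, so this assumes the published time filter (2/T)(cos(ωt) − 1/4) over one period T = 2π/ω; grid size, ω, and iteration count are illustrative. We solve u'' + ω²u = f on (0,1) with homogeneous Dirichlet data by time-stepping the wave equation w_tt = w_xx − f cos(ωt), w(0) = v, w_t(0) = 0, and filtering in time:

```python
import numpy as np

n = 50
h = 1.0 / (n + 1)
xs = np.linspace(h, 1 - h, n)
omega = 10.0                                  # between 3*pi and 4*pi: non-resonant
f = np.sin(np.pi * xs)                        # smooth forcing

# Second-difference matrix (Dirichlet boundary conditions).
D = (np.diag(-2 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2

T = 2 * np.pi / omega
m = 256                                       # time steps per period (CFL-safe)
dt = T / m

def waveholtz_apply(v):
    """One filtered wave solve: v -> (2/T) int_0^T (cos(w t) - 1/4) w dt."""
    w_prev = v.copy()
    w = v + 0.5 * dt**2 * (D @ v - f)         # Taylor start, since w_t(0) = 0
    acc = 0.5 * (np.cos(0.0) - 0.25) * w_prev # trapezoid-rule accumulator
    for k in range(1, m + 1):
        t = k * dt
        weight = 0.5 if k == m else 1.0
        acc += weight * (np.cos(omega * t) - 0.25) * w
        if k < m:                             # leapfrog step t_k -> t_{k+1}
            w, w_prev = (2 * w - w_prev
                         + dt**2 * (D @ w - f * np.cos(omega * t))), w
    return (2.0 / T) * dt * acc

u_direct = np.linalg.solve(D + omega**2 * np.eye(n), f)  # reference solve
v = np.zeros(n)
for _ in range(50):
    v = waveholtz_apply(v)                    # fixed-point iteration
print(np.linalg.norm(v - u_direct) / np.linalg.norm(u_direct))
```

Note that each pass only requires explicit time steps of the wave equation, with no adjoint solve, matching the selling point in the abstract.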

Abstract:

We study structured multi-armed bandits, which is the problem of online decision-making under uncertainty in the presence of structural information. In this problem, the decision-maker needs to discover the best course of action despite observing only uncertain rewards over time. The decision-maker is aware of certain structural information regarding the reward distributions and would like to minimize its regret by exploiting this information, where the regret is its performance difference against a benchmark policy that knows the best action ahead of time. In the absence of structural information, the classical UCB and Thompson sampling algorithms are well known to incur only minimal regret. Neither algorithm is, however, capable of exploiting structural information, which is commonly available in practice. We propose a novel learning algorithm, which we call "DUSA", whose worst-case regret matches the information-theoretic regret lower bound up to a constant factor and which can handle a wide range of structural information. Our algorithm DUSA solves a dual/robust counterpart of the regret lower bound at the empirical reward distribution and follows the suggestion made by the dual problem. Our proposed algorithm is the first computationally viable learning policy for structured bandit problems that achieves asymptotically minimal regret.
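For contrast with the structured setting, here is a minimal sketch of the classical UCB policy mentioned above (DUSA itself is not reproduced; the Bernoulli arms and horizon are illustrative):

```python
import numpy as np

# UCB1 on a 3-armed Bernoulli bandit with unknown means.
rng = np.random.default_rng(2)
means = np.array([0.2, 0.5, 0.8])      # arm 2 is best
K, T = len(means), 5000
counts = np.zeros(K)
sums = np.zeros(K)

for t in range(T):
    if t < K:
        a = t                          # pull each arm once to initialize
    else:
        ucb = sums / counts + np.sqrt(2 * np.log(t) / counts)
        a = int(np.argmax(ucb))        # optimism in the face of uncertainty
    r = float(rng.random() < means[a])
    counts[a] += 1
    sums[a] += r

print(counts)                          # the best arm dominates the pulls
```

UCB needs no structural knowledge: the confidence bonus shrinks as an arm is sampled, so suboptimal arms are pulled only O(log T) times.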

### 2019

Abstract: We present a new statistical framework to quantify uncertainty (UQ) for recovering low-rank matrices from incomplete and noisy observations. We further develop a sequential active sampling approach guided by the uncertainties. The motivation comes from two related and widely studied problems: matrix completion, which aims to recover a low-rank matrix X from a partial, noisy observation of its entries, and low-rank matrix recovery, which recovers X from a set of linear combinations of its entries with additive noise. The proposed framework reveals several novel insights on the role of coherence and uncertainty quantification for noisy matrix completion and recovery. Using such insights, we develop an efficient posterior sampler for UQ, which is then used to guide a closed-form sampling scheme for matrix entries. We show the competitive performance of this integrated sampling/UQ methodology in simulation studies and applications to collaborative filtering and imaging, compared with existing approaches. This is joint work with Simon Mak at Duke University and Shaowu Yuchi at Georgia Tech.
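As a point of reference for the completion problem (a plain point estimate, not the talk's Bayesian UQ framework; sizes, sampling rate, and shrinkage level are illustrative), here is a SoftImpute-style iteration of SVD soft-thresholding:

```python
import numpy as np

# Noisy matrix completion by iterative SVD soft-thresholding.
rng = np.random.default_rng(3)
n, r = 50, 2
X = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank-2 truth
mask = rng.random((n, n)) < 0.6                                # 60% observed
Y = X + 0.01 * rng.standard_normal((n, n))                     # noisy data

Z = np.zeros((n, n))
lam = 1.0
for _ in range(100):
    # Fill unobserved entries with the current estimate, then shrink ranks.
    filled = np.where(mask, Y, Z)
    U, s, Vt = np.linalg.svd(filled, full_matrices=False)
    Z = U @ np.diag(np.maximum(s - lam, 0.0)) @ Vt

print(np.linalg.norm(Z - X) / np.linalg.norm(X))   # relative recovery error
```

This returns a single estimate Z with no uncertainty attached; the abstract's point is precisely that quantifying the entrywise uncertainty lets one decide which entries to sample next.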

Abstract: The primary visual cortex (V1) of mammals is an area of the brain that receives and interprets visual input. Neurons in V1 respond preferentially to a particular orientation angle of a visual stimulus. In mammals such as cats, V1 contains an ordered map of the orientation preference of each neuron, with cells preferring similar angles residing close to one another. In mice, however, the map of orientation preference appears random and disordered, with little correlation between preferred orientation and location in cortical space. Though much is known about orientation-preference maps in adult mammals, the mechanism underlying the formation of these maps during development is still unknown. In particular, I am interested in understanding under which circumstances the map that forms appears ordered (like in cats) or disordered (like in mice). In this talk, I will discuss a mathematical model used to describe the development of neuronal networks in V1 and suggest a testable hypothesis for the mechanism underlying the formation of either an ordered or disordered orientation-preference map.

Abstract: Randomized algorithms and greedy algorithms are both widely used in practice. In this talk, I'll present a random-then-greedy procedure, and showcase how it improves two important machine learning algorithms: stochastic gradient descent and gradient boosting machines. In the first part of the talk, we propose a new stochastic optimization framework for empirical risk minimization problems such as those that arise in machine learning. The traditional approaches, such as (mini-batch) stochastic gradient descent (SGD), utilize an unbiased gradient estimator of the empirical average loss. In contrast, we develop a computationally efficient method to construct a gradient estimator that is purposely biased toward those observations with higher current losses. On the theory side, we show that the proposed method minimizes a new ordered modification of the empirical average loss, and is guaranteed to converge at a sublinear rate to a global optimum for convex loss and to a critical point for weakly convex (non-convex) loss. Furthermore, we prove a new generalization bound for the proposed algorithm. On the empirical side, numerical experiments show that our proposed method consistently improves the test error compared with standard mini-batch SGD across various models, including SVMs, logistic regression, and deep learning problems...
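The "biased toward observations with higher current losses" idea can be sketched as follows (a toy instance, not the paper's exact algorithm; the logistic model, batch size, and top-q rule are illustrative): draw a random minibatch, then greedily average the gradient over only the q hardest samples in it:

```python
import numpy as np

# Random-then-greedy minibatch sketch for logistic regression.
rng = np.random.default_rng(4)
n, d = 500, 10
w_true = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = (X @ w_true > 0).astype(float)            # linearly separable labels

def losses(w, Xb, yb):
    z = Xb @ w
    # Numerically stable per-sample logistic loss log(1 + e^z) - y*z.
    return np.log1p(np.exp(-np.abs(z))) + np.maximum(z, 0) - yb * z

w = np.zeros(d)
lr, batch, q = 0.2, 50, 10
for _ in range(300):
    idx = rng.choice(n, batch, replace=False)  # random: sample a minibatch
    Xb, yb = X[idx], y[idx]
    top = np.argsort(losses(w, Xb, yb))[-q:]   # greedy: keep the q hardest
    Xt, yt = Xb[top], yb[top]
    grad = Xt.T @ (1 / (1 + np.exp(-Xt @ w)) - yt) / q
    w -= lr * grad

print(np.mean(losses(w, X, y)))               # full-data average loss
```

Averaging over the hardest q samples makes the estimator biased relative to the empirical average loss, which is exactly the ordered modification the theory in the talk analyzes.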

Abstract: We develop two efficient stochastic gradient-based algorithms to solve a class of composite nonconvex optimization problems that covers both finite-sum and expectation settings. In the first part of the talk, we propose a new stochastic proximal gradient framework that utilizes a well-known biased stochastic estimator, called SARAH, introduced by Nguyen et al. (2017). The algorithm consists of two steps: a proximal gradient step and an averaging step, making it different from existing nonconvex proximal-type algorithms. It only requires an average smoothness assumption on the nonconvex objective term and an additional bounded variance assumption if applied to expectation problems. It works with both constant and adaptive step-sizes, while allowing single-sample and mini-batch variants. In all these cases, we prove that our algorithms can achieve the best-known complexity bounds. In the second part of this talk, we introduce a novel hybrid approach to form new stochastic estimators for objective functions and propose a hybrid stochastic proximal gradient method to solve composite nonconvex optimization problems...
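A hedged sketch of a proximal step built on the SARAH recursion v_t = ∇f_i(x_t) − ∇f_i(x_{t−1}) + v_{t−1}, shown on a convex toy instance (the talk addresses the nonconvex setting; sizes, step size, and loop lengths are illustrative, and the averaging step of the talk's framework is omitted):

```python
import numpy as np

# Composite problem: min_x (1/n) sum_i 0.5*(a_i^T x - b_i)^2 + mu*||x||_1.
rng = np.random.default_rng(5)
n, d = 200, 20
A = rng.standard_normal((n, d))
x_star = np.zeros(d); x_star[:3] = [2.0, -1.0, 1.5]   # sparse ground truth
b = A @ x_star + 0.01 * rng.standard_normal(n)
mu, eta = 0.1, 0.002

def soft(z, t):                        # proximal operator of t * ||.||_1
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

x = np.zeros(d)
for epoch in range(100):
    v = A.T @ (A @ x - b) / n          # full gradient at the snapshot
    x_prev = x.copy()
    x = soft(x - eta * v, eta * mu)
    for _ in range(50):                # short inner loop of SARAH steps
        i = rng.integers(n)
        gi = A[i] * (A[i] @ x - b[i])
        gi_prev = A[i] * (A[i] @ x_prev - b[i])
        v = gi - gi_prev + v           # SARAH recursion (biased estimator)
        x_prev = x.copy()
        x = soft(x - eta * v, eta * mu)

print(np.linalg.norm(x - x_star))
```

Unlike SVRG, the recursion updates the reference point at every inner step, which is what makes the estimator biased but low-variance near convergence.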

Abstract:

The prediction of interactions between nonlinear laser beams is a longstanding open problem. A traditional assumption is that these interactions are deterministic. We have shown, however, that in the nonlinear Schrödinger equation (NLS) model of laser propagation, beams lose their initial phase information in the presence of input noise. Thus, the interactions between beams become unpredictable as well.

Computationally, these predictions are enabled through a novel spline-based stochastic computational method. Our algorithm efficiently estimates the probability density functions (PDFs) that result from differential equations with random input. This is a new and general problem in numerical uncertainty quantification (UQ), which leads to surprising results and analysis at the intersection of probability and transport theory.
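For scale, the generic Monte Carlo baseline for this UQ problem looks as follows (a histogram estimator, not the talk's spline-based method; the toy ODE, sample count, and bin count are illustrative): push samples of a random input parameter through an ODE solve and estimate the output's PDF:

```python
import numpy as np

# PDF of y(1) for the toy ODE y' = -alpha * y, y(0) = 1, alpha ~ U[0,1],
# so y(1) = exp(-alpha) up to time-discretization error.
rng = np.random.default_rng(6)
alphas = rng.uniform(0.0, 1.0, 5000)   # random input parameter
y = np.ones_like(alphas)
dt = 0.01
for _ in range(100):                   # explicit Euler, vectorized over samples
    y += dt * (-alphas * y)
samples = y                            # approx exp(-alpha)

# Histogram estimate of the PDF of y(1); exact density is 1/y on [1/e, 1].
density, edges = np.histogram(samples, bins=40, density=True)
print(density.max())
```

Each sample requires a full ODE solve, which is why more sample-efficient density estimators (such as the spline-based method in the talk) matter.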

Amir Sagiv is a Chu Assistant Professor in Columbia University's Department of Applied Physics and Applied Mathematics. Before that, he completed his Ph.D. in Applied Mathematics at Tel Aviv University under the supervision of Gadi Fibich and Adi Ditkowski, and earned his B.Sc. in Mathematics and Physics at the Hebrew University of Jerusalem.

Abstract:

Recent years have seen a flurry of activity in designing provably efficient nonconvex procedures for solving statistical estimation problems. The premise is that despite nonconvexity, the loss function may possess benign geometric properties that enable fast global convergence under carefully designed initializations. In many sample-starved problems, this benign geometry only holds over a restricted region of the entire parameter space with certain structural properties, yet gradient descent seems to follow a trajectory staying within this nice region without explicit regularization, and is thus extremely computationally efficient and admits strong statistical guarantees. In this talk, we will formally establish this “implicit regularization” phenomenon of gradient descent for the fundamental problem of estimating low-rank matrices from noisy, incomplete, rank-one, or bilinear measurements, by exploiting statistical modeling in analyzing iterative optimization algorithms via a leave-one-out perturbation argument. We will also discuss an approach based on median-guided truncated gradient descent which is provably resilient to outliers while maintaining computational and statistical efficiency.
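The setting can be sketched as follows (a toy matrix-completion instance, not the talk's analysis; sizes, initialization details, and step size are illustrative): run plain gradient descent on a Burer-Monteiro factorization X = UV^T with no projection or explicit regularizer, which is the regime where the implicit regularization phenomenon is studied:

```python
import numpy as np

# Unregularized gradient descent on a rank-2 factorization for noisy
# matrix completion, with spectral initialization.
rng = np.random.default_rng(7)
n, r = 40, 2
Xs = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # truth
mask = rng.random((n, n)) < 0.5                                 # 50% observed
Y = np.where(mask, Xs + 0.01 * rng.standard_normal((n, n)), 0.0)

# Spectral initialization from the rescaled observed matrix.
p = mask.mean()
Uf, s, Vt = np.linalg.svd(Y / p, full_matrices=False)
U = Uf[:, :r] * np.sqrt(s[:r])
V = Vt[:r].T * np.sqrt(s[:r])

lr = 0.005
for _ in range(1000):
    R = mask * (U @ V.T - Y)           # residual on observed entries only
    U, V = U - lr * (R @ V), V - lr * (R.T @ U)

print(np.linalg.norm(U @ V.T - Xs) / np.linalg.norm(Xs))
```

No incoherence projection or regularization term appears in the update; the claim in the talk is that the iterates nonetheless stay in the benign region where such guarantees hold.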

Abstract:

Fast-slow systems are notoriously difficult to simulate. Numerical stability limitations force explicit integrators to waste cycles resolving the fast dynamics. Implicit integrators often have the stability to step over the daunting short timescale, but suffer from poorly conditioned nonlinear solves when taking large timesteps, as well as accuracy issues when stepping over potentially relevant short-timescale dynamics. In this talk I will propose a general strategy for developing improved implicit integrators that exploit a fast-slow system's slow manifold. These slow manifold integrators enjoy well-conditioned nonlinear solves and useful error estimates.

Specifically, errors are well-controlled _near the slow manifold_. After presenting the general approach, I will discuss a nontrivial application of slow manifold integration to charged particle dynamics in a strong magnetic field.
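The stability gap between explicit and implicit integrators on a fast-slow system can be seen in a toy linear example (not the talk's slow-manifold integrators; the system and step size are illustrative):

```python
import numpy as np

# Fast-slow system: x' = -y,  eps * y' = x - y,  with eps = 1e-3.
# The slow manifold is approximately y = x, on which x' = -x.
# Backward Euler steps far beyond the explicit stability limit dt ~ eps.
eps, dt, steps = 1e-3, 0.1, 50
x, y = 1.0, 0.0                        # start off the slow manifold

for _ in range(steps):
    # Backward Euler: solve the (linear) implicit equations for (x+, y+):
    #   x+ = x - dt * y+
    #   y+ = y + (dt / eps) * (x+ - y+)
    M = np.array([[1.0, dt],
                  [-dt / eps, 1.0 + dt / eps]])
    x, y = np.linalg.solve(M, np.array([x, y]))

print(x, y)   # trajectory relaxes onto y ~ x, then decays like x' = -x
```

Explicit Euler at this step size would amplify the fast mode by roughly |1 − dt/eps| = 99 per step and blow up immediately; the implicit solve here is linear, but for stiff nonlinear systems the large dt/eps entries are exactly what produces the poorly conditioned solves the abstract describes.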