Computer Science Colloquia & Seminars are held each semester and sponsored by the Computer Science department. Faculty invite speakers from all areas of computer science, and the talks are open to all members of the RPI community.
Wednesday, February 24th
Each presenter will host in their own Webex room. Feel free to move from room to room to hear about their research!
A Machine Learning Approach to Predicting Continuous Tie Strengths
Fake it Till You Make it: Self-supervised Semantic Shifts for Monolingual Word Embeddings Tasks
Inferring Degrees from Incomplete Networks and Nonlinear Dynamics
Applying Network Science to Political Polarization
In order to restore normalcy, all guests are expected to:
The excitement of technology is contagious. Technologies such as artificial intelligence and blockchain promise to revolutionize our society and our lives. And yet technology creation still lacks significant participation from many people, including women, Black, and Hispanic engineers and computer scientists. Dr. Whitney, a computer scientist, entrepreneur, and world-recognized expert on women in technology, will speak on the dream of inclusion in technology. She will discuss her own journey as a woman engineer and her efforts to create a global movement for women technologists. She will discuss the landscape today, why organizations benefit from creating cultures where all people thrive, and concrete actions you can take to create meaningful change.
Bio: Telle Whitney is a senior executive leader, an entrepreneur, a recognized expert on diversity and inclusion, and a true pioneer on the issue of women and technology. She has over 20 years of leadership experience, and was named one of Fast Company’s Most Influential Women in Technology. She is a frequent speaker on the topic of Women and Technology. Telle co-founded the Grace Hopper Celebration of Women in Computing Conference with Anita Borg in 1994 and served as CEO of the Anita Borg Institute from 2002 to September 2017. She transformed the Institute into a recognized world leader for women and technology.
She has won numerous awards, including the ACM Distinguished Service Award and an honorary degree from CMU, and is an honorary member of IEEE. She serves on multiple boards and advisory councils. She is also the co-founder of the National Center for Women and Information Technology (NCWIT). Telle holds a Ph.D. and M.S. in Computer Science from the California Institute of Technology and a B.S. in Computer Science from the University of Utah.
Investigating the frequency and distribution of small subgraphs with a few nodes/edges, i.e., motifs, is an effective analysis method for static networks. Motif-driven analysis is also useful for temporal networks, where the number of motifs is significantly larger due to the additional temporal information on edges. This variety makes it challenging to design a temporal motif model that can consider all aspects of temporality. In the literature, previous works have introduced various models that handle different characteristics. In this work, we compare the existing temporal motif models and evaluate the facets of temporal networks that are overlooked in the literature. We first survey four temporal motif models and highlight their differences. Then, we evaluate the advantages and limitations of these models with respect to temporal inducedness and timing constraints. We also suggest a new lens, event pairs, to investigate cause-effect relations. We demonstrate that the existing models yield biased results and discuss our efforts to mitigate those biases, with examples from financial transaction networks.
Bio: A. Erdem Sariyuce is an Assistant Professor in Computer Science and Engineering at the University at Buffalo. Prior to that, he was the John von Neumann postdoctoral fellow at Sandia National Laboratories. Erdem received his Ph.D. in Computer Science from the Ohio State University. He conducts research on large-scale graph mining with a particular focus on practical algorithms to explore and process real-world networks.
A graph is a beautiful mathematical concept that can concisely model any interacting system such as protein interactions in organisms, chemical bonds in compounds, and friendships on social networks. Thus, machine learning on graphs to predict edges, node features, and community structures plays an important role in computational biology, social science, neurology, and computational chemistry.
In this talk, I will discuss computational building blocks needed to design graph machine learning algorithms, including graph embedding, graph neural networks (GNNs), and graph visualization. Computationally, major steps in these algorithms can be mapped to sampled dense-dense matrix multiplication (SDDMM) and sparse-dense matrix multiplication (SpMM). We can even fuse these two operations into a single kernel called FusedMM that captures almost all computational patterns needed by popular graph embedding and GNN approaches. I will then discuss parallel algorithms for these kernels. As the performance of these linear-algebraic operations is bound by the memory bandwidth, we develop algorithms that minimize data movement from the main memory and utilize cache and registers as much as possible. We also autotune these operations so that the same code can perform equally well on Intel, AMD, IBM Power9, and ARM processors. The kernel-based design and tuned parallel implementations speed up end-to-end graph embedding and GNN algorithms by up to 28x over existing approaches based on the message passing paradigm. The take-home message of this talk is that developing graph machine learning algorithms using efficient building blocks provides a cleaner interface to application developers and boosts the performance of end-to-end graph learning applications.
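For readers unfamiliar with the two kernels named in the abstract, the following is a minimal pure-Python sketch of SDDMM, SpMM, and one way the two could be fused into a single pass over the nonzeros. It is an illustration only, not the speaker's FusedMM implementation: the dictionary-based sparse format, the function names, and the choice of a dot product as the per-edge operation are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Assumed toy formats: a sparse matrix is a dict of (row, col) -> value,
# a dense matrix is a list of rows.
Sparse = Dict[Tuple[int, int], float]
Dense = List[List[float]]


def sddmm(s: Sparse, a: Dense, b: Dense) -> Sparse:
    """Sampled dense-dense product: out[i, j] = s[i, j] * dot(a[i], b[j]),
    computed only where s has a nonzero."""
    return {
        (i, j): v * sum(x * y for x, y in zip(a[i], b[j]))
        for (i, j), v in s.items()
    }


def spmm(s: Sparse, x: Dense, n_rows: int) -> Dense:
    """Sparse-dense product: out = s @ x."""
    d = len(x[0])
    out = [[0.0] * d for _ in range(n_rows)]
    for (i, j), v in s.items():
        for k in range(d):
            out[i][k] += v * x[j][k]
    return out


def fused(s: Sparse, a: Dense, b: Dense, n_rows: int) -> Dense:
    """Same result as spmm(sddmm(s, a, b), b, n_rows), but each edge weight
    is consumed as soon as it is computed, so the intermediate sparse
    matrix is never stored."""
    d = len(b[0])
    out = [[0.0] * d for _ in range(n_rows)]
    for (i, j), v in s.items():
        w = v * sum(x * y for x, y in zip(a[i], b[j]))  # SDDMM step
        for k in range(d):
            out[i][k] += w * b[j][k]                     # SpMM step
    return out
```

Avoiding the intermediate matrix is the kind of data-movement saving the abstract refers to; a real kernel would additionally use a compressed sparse format, vectorization, and parallelism.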
Bio: Dr. Ariful Azad is an Assistant Professor of Intelligent Systems Engineering at the Luddy School of Informatics, Computing, and Engineering at Indiana University (IU). He was a Research Scientist in the Computational Research Division at Lawrence Berkeley National Laboratory. Dr. Azad obtained his Ph.D. from Purdue University and his B.S. from Bangladesh University of Engineering and Technology. His research interests are in graph machine learning, sparse matrix algorithms, high-performance computing, and bioinformatics. His interdisciplinary research group strives to solve large-scale problems in genomics, earth science, and scientific computing.
Graph convolutional networks (GCNs) are a widely used method for graph representation learning. To elucidate the capabilities and limitations of GCNs, we investigate their power, as a function of their number of layers, to distinguish between different random graph models (corresponding to different class-conditional distributions in a classification problem) on the basis of the embeddings of their sample graphs. In particular, the graph models that we consider arise from graphons, which are the most general possible parameterizations of infinite exchangeable graph models and which are the central objects of study in the theory of dense graph limits. We give a precise characterization of the set of pairs of graphons that are indistinguishable by a GCN of moderate depth. This characterization is in terms of a degree profile closeness property. Outside this class, a very simple GCN architecture suffices for distinguishability. We then explore the effect of the embedding dimension on classification performance: the impact is minimal for the problem of distinguishing between two graphons, but plays a clear role in the multi-hypothesis setting.
These results theoretically match empirical observations of several prior works on GCNs. To prove our results, we exploit a connection to random walks on graphs.
Bio: Abram Magner is an assistant professor in the Department of Computer Science at the University at Albany, SUNY. Prior to that, he held postdoctoral fellowships in the Department of EECS at the University of Michigan and in the Coordinated Science Lab at the University of Illinois at Urbana-Champaign. His interests are in theoretical machine learning, probabilistic inference problems on graphs and other random structures, and information theory.
Abstract: Recent innovations in natural language processing and deep learning not only excel at numerous tasks in industry and academia but also keep impacting our daily lives. However, there remain challenges regarding NLP modeling for specific tasks and model deployment in the real world. This talk will discuss new models to improve the performance in emerging tasks and novel algorithms to enhance the scalability of model deployment.
For the challenges regarding modeling, this talk will demonstrate a new algorithm for machine translation with low-resource language pairs, by bridging source and target languages with a pivot language. This talk will also propose the first end-to-end conditional generative model for generating paraphrases via adversarial training. At the same time, to scale up the model deployment, this talk will consider challenges in both training and inference. For training, this talk will present a novel model-parallel algorithm to parallelize the training of Transformer-based language models with over a billion parameters. For inference, this talk will consider platforms with limited memory and computation ability, such as mobile devices, and introduce different strategies to transform continuous and generic sentence embeddings into a binarized form, while preserving their rich semantic information. Finally, I will discuss applications of natural language processing techniques in interdisciplinary research, including biomedical data science, human-computer interaction, and education.
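As a rough illustration of what a binarized sentence embedding looks like, the sketch below uses hard thresholding, which is only the simplest conceivable strategy and an assumption for this example; the abstract does not say which strategies the talk covers. The function names and the threshold value are hypothetical.

```python
from typing import List


def binarize(vec: List[float], threshold: float = 0.0) -> List[int]:
    """Map each continuous dimension to one bit by hard thresholding."""
    return [1 if x > threshold else 0 for x in vec]


def pack_bits(bits: List[int]) -> int:
    """Pack a bit list into one integer so the embedding is stored compactly."""
    packed = 0
    for bit in bits:
        packed = (packed << 1) | bit
    return packed


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two packed embeddings."""
    return bin(a ^ b).count("1")
```

With packed codes, comparing two sentences reduces to an XOR and a bit count, which is why this kind of representation suits devices with limited memory and compute. How much semantic information survives depends on the binarization strategy, which this sketch does not address.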
Bio: Qian (Lara) Yang is a postdoctoral researcher at Duke University, working with Lawrence Carin. Before that, she worked with Rebecca J. Passonneau at both Columbia University and Pennsylvania State University, and Hermann Ney at RWTH Aachen University in Germany. She received her Ph.D. in Computer Science and Technology from Tsinghua University, advised by Gerard de Melo. Her research focuses on a wide range of topics in Natural Language Processing, Machine Learning, and the intersection of Natural Language Processing with Biomedical Data Science, Education, and Human-Computer Interaction. She is the recipient of the Educational Advances in Artificial Intelligence (EAAI) New and Future AI Educator award (funded by grants from NSF, AIJ, AAAI), the Google Women Techmakers Scholarship, and an RWTH – Tsinghua Research Fellowship (provided by DAAD in Germany), and was nominated for the 2019 Outstanding Postdoc at Duke University award. She has served as a session chair at AAAI and on program committees for major conferences such as ACL, EMNLP, NAACL, COLING, NeurIPS, ICML, AAAI, IJCAI, and KDD.
Humans learn to perceive the world through multiple modalities including visual, auditory, and kinesthetic stimuli. The need for perception is self-evident while humans invented language for communication and documentation. Therefore, language and perception lay foundations for artificial intelligence, and how to ground natural language onto real-world perception is a fundamental challenge to empower various practical applications that require human-machine communication.
In this talk, I will mainly present two of my research thrusts on developing intelligent embodied agents that connect language, vision, and actions, and that communicate with humans in the real world. First, moving beyond natural language understanding from text-only corpora, I have situated natural language inside interactive environments where communication often takes place (language → vision). So I will discuss how to effectively ground natural language instructions and visual inputs to actions in real-world navigation tasks with reinforcement learning and imitation learning. Second, in order to enable an agent to describe the visual surroundings for humans (vision → language), I will explore challenges of language generation conditioned on visual context, and present novel solutions from coarse-grained to fine-grained caption generation, and then to humanlike story generation. In the end, I will conclude with my future research plan.
Bio: Xin Wang is a Ph.D. candidate at the University of California, Santa Barbara. His research interests include natural language processing, computer vision, and machine learning, especially their intersection. He works on fundamental research directions that enable intelligent embodied agents to communicate with humans in the real world. He has published over 20 papers at top-tier CV, NLP, and ML venues such as CVPR, ICCV, ECCV, ACL, NAACL, EMNLP, AAAI, and TPAMI. He received the CVPR Best Student Paper Award in 2019. Xin is also professionally active and has organized multiple academic events on the topic of his research, including workshops at ACL 2020, CVPR 2020, and ICCV 2019, and a tutorial at AACL-IJCNLP 2020. He also served as a session chair for the NLP session at AAAI 2019. He has worked at Google AI, Facebook AI Research, Microsoft Research (Redmond), and Adobe Research.
Complex systems are often associated with heterogeneous data (e.g., structural graphs, unstructured content, time series). Heterogeneous data give researchers and practitioners the opportunity to understand complex systems more comprehensively, but they also pose challenges (in terms of data extraction, representation, and fusion) for knowledge discovery. The goal of my research is to harness the power of heterogeneous data, turn them into useful knowledge, and develop artificial intelligence solutions based on the extracted knowledge for a diverse set of real-world applications. In this talk, I will present my work on learning from heterogeneous data through fusion, which I will introduce from three perspectives: relation and text fusion for personalization, graph and content fusion for graph embedding, and temporal and spatial context fusion for anomaly detection. Extending the current research, I will describe efficient and interpretable learning from heterogeneous data as my future agenda.
Bio: Chuxu Zhang is a Ph.D. candidate in the Department of Computer Science and Engineering at the University of Notre Dame, where he has worked under the supervision of Prof. Nitesh V. Chawla since May 2017. Before that, he received his master's and bachelor's degrees from Rutgers University and the University of Electronic Science and Technology of China, respectively. His research interests are data science, machine learning, artificial intelligence, and their applications in graph/network mining, recommendation/personalization, time series/spatial-temporal data analysis, natural language processing, chemical synthesis, etc. His work has appeared in premier data science and artificial intelligence conferences including KDD, WWW, AAAI, IJCAI, WSDM, CIKM, and SDM. He has served as a PC member for conferences including ICLR, KDD, AAAI, IJCAI, CIKM, ICDM, SDM, etc., and as a journal reviewer for TKDE, TNNLS, DMKD, IP&M, etc. He was a Best Paper Award candidate at WWW 2019 and received the Best Student Paper Award at WAIM 2016. Beyond academia, he has held research internships at Microsoft Research, NEC Labs America, and IBM Watson Research Center. More information can be found on his website: https://chuxuzhang.github.io/
In the era of the Internet of Things, mobile computing, and Big Data, millions of sensors and mobile devices are constantly generating massive volumes of data. To utilize the vast amount of data without violating data privacy, Federated Learning has emerged as a new paradigm of distributed machine learning that orchestrates model training across mobile devices. In this talk, I will first introduce the current challenges in distributed machine learning, and then present my recent work on statistical heterogeneity in federated learning and on distributed machine learning on a serverless architecture. Specifically, I will talk about applying reinforcement learning to optimize distributed machine learning by learning the best choice for task scheduling and resource provisioning.
Bio: Hao Wang is a fifth-year Ph.D. candidate at the University of Toronto under the supervision of Professor Baochun Li. Hao received both his B.E. degree in Information Security and his M.E. degree in Software Engineering from Shanghai Jiao Tong University, in 2012 and 2015, respectively. His research interests include large-scale data analytics, distributed machine learning, and datacenter networking. He has published 16 papers (including eight first-author papers) in prestigious networking and systems conferences and journals, such as INFOCOM, SoCC, TPDS, and ToN. For more information about Hao, please visit https://www.haow.ca/.
How can we make conversational agents more intelligent? How can we make sure that we are making progress towards machine understanding of language? My work addresses these questions, and the answers have the potential to change how dialogue systems are designed and evaluated.
In my talk, I will discuss several research projects which investigate these questions and demonstrate how human language can mirror and predict human behavior, and even how certain human behaviors can be automated. Based on this body of work, I will also discuss methods for validating model output, which brings in novel perspectives from experiment design and cognitive biases.
Bio: Samira Shaikh is an Assistant Professor of Cognitive Science in the Department of Computer Science at the University of North Carolina - Charlotte (UNCC). She holds a joint appointment with the Department of Psychological Science and an affiliate faculty position at the Data Science Initiative at UNCC. Her research interests are in the areas of Computational Sociolinguistics, Dialogue Systems and Artificial Intelligence. She has published over 50 articles in prestigious venues, including most recently at AAAI and CHI. Her work has been funded by federal agencies, including IARPA, DARPA, Army Research Office, Army Research Laboratory, and NSF, as well as by industry (NVIDIA).
Who? What? When? Where? Why? are fundamental questions asked when gathering knowledge about and understanding a concept, topic, or event. The answers to these questions underpin the key information conveyed in the overwhelming majority, if not all, of language-based communication. Unfortunately, typical machine learning models and Information Extraction (IE) techniques heavily rely on human annotated data, which is usually very expensive and only available and compiled for very limited types or languages, rendering them incapable of dealing with information across various domains, languages, or other settings.
In this talk, I will introduce a new information extraction paradigm, Cold-Start Universal Information Extraction, which aims to create the next generation of information access where machines can automatically discover accurate, concise, and trustworthy information embedded in data of any form without requiring any human effort. Principally, my efforts along this line go towards three questions: (1) How can machines automatically discover the key information from texts without any pre-defined types or any human annotated data? (2) How can machines benefit from available resources, e.g., large-scale ontologies or existing human annotations? (3) How can information extraction approaches be extended to low-resource languages without any extra human effort? My research answers these questions with three key research innovations: a Liberal Information Extraction framework which discovers structured information bottom-up and automatically induces a type schema, a Zero-shot IE approach which reframes IE as a grounding problem instead of classification, and a multilingual common semantic space framework which retains clustering structures in each language and makes IE feasible for thousands of languages. I will conclude my talk by showing what challenges remain and discussing several future research directions.
Bio: Lifu Huang is a PhD candidate in the Computer Science Department of the University of Illinois at Urbana-Champaign. He has a wide range of research interests in natural language processing and understanding. Specifically, his current research focuses on developing efficient information extraction approaches to automatically extract structured knowledge from any form of data at little to no cost. He received his M.S. from Peking University in 2014 with the highest university honor and National Scholarship. He has served as a Program Committee member for many top NLP and AI venues including ACL, EMNLP, NAACL, AAAI, etc. He also received a fellowship from the Allen Institute for Artificial Intelligence (AI2) in 2019.
Recently, the use of Human-Computer Interaction (HCI) and Artificial Intelligence (AI) for promoting social good and pursuing sustainability has begun to receive wide attention. Conventionally, scientists lead projects that engage lay people in research activities. However, using this top-down approach to "parachute" technology intervention without thoughtful consideration may cause irreversible harm to communities. We introduce an alternative framework, Community Citizen Science (CCS), to closely connect research and social issues by empowering communities to produce scientific knowledge, represent their needs, address their concerns, and advocate for impact. CCS advances the current science-oriented concept to a deeper level that aims to sustain community engagement when researchers are no longer involved after the intervention of interactive systems. This seminar will introduce the CCS concept through three deployed systems that applied HCI and AI techniques, as well as future directions in this field. We envision that CCS can drive HCI and AI toward citizen empowerment at a time when community concerns, sustainability issues, and technological ethics are at the forefront of global social discourse.
Bio: Yen-Chia Hsu is a computer scientist with an architectural design background. He designs, implements, deploys, and evaluates interactive systems that support community empowerment, especially for sustainability issues. Currently, he is a Project Scientist in the CREATE Lab at Carnegie Mellon University (CMU). He received his Ph.D. degree in Robotics in 2018 from the Robotics Institute at CMU. Previously, he received his Master's degree in tangible interaction design in 2012 from the School of Architecture at CMU, where he studied and built prototypes of interactive robots and wearable devices. Before CMU, he earned his dual Bachelor's degree in both architecture and computer science in 2010 at National Cheng Kung University, Taiwan.
Artificial intelligence boosted with data-driven methods has surpassed human-level performance in various tasks. However, its application to autonomous systems still faces fundamental challenges such as lack of interpretability and intensive need for data. To address these challenges, this talk presents interpretable and data-efficient learning approaches that weave together theories and techniques in machine learning, formal methods and control theory. Different from traditional learning approaches, our approaches take into account the limited availability of simulated and real data, the uncertainties of the underlying model, and the expressivity and interpretability of high-level knowledge representations.
The first part of this talk focuses on learning high-level knowledge from data and its application in data-driven control. I present some state-of-the-art methods for learning informative temporal logic formulas from data. I further present a data-driven control method of robots in unknown environments that combines the learning of temporal logic formulas and the controller synthesis with the learned temporal logic formulas. The second part of this talk focuses on improving reinforcement learning with high-level knowledge. I present an approach to iteratively infer high-level knowledge (in the form of finite state machines with reward outputs) while the reinforcement learning proceeds, and the inferred high-level knowledge can effectively guide the future explorations of the learning agent. The experiments show that our approach can lead to fast convergence to optimal policies, while standard reinforcement learning methods fail to converge to optimal policies after a substantial number of training steps in many tasks.
Bio: Zhe Xu is a postdoctoral fellow at the Oden Institute for Computational Engineering and Sciences at the University of Texas at Austin. He received a Ph.D. degree in electrical engineering at Rensselaer Polytechnic Institute in 2018. He received the Howard Kaufman '62 Memorial Fellowship Award in 2016. His research interests lie in the area of formal methods, control theory and machine learning. He has developed various learning, control and verification methods with applications to robotics, power systems, smart buildings and biological systems.
At the heart of AI are the systems comprised of intelligent agents and the processes driven by these agents' interaction. Two types of such systems of prominent importance for the society are social and economic networks. In this talk, I will review my contributions to the studies of social and economic networks along the three dimensions of analysis, modeling, and control.
In the first part of the talk, I will review my work on data-driven model-based analysis as well as control of social networks. I will highlight the concept of the Social Network Distance (SND)---a distance measure that quantifies the likelihood of an observed change in the state of a large evolving social network with respect to a chosen model of agent state evolution. SND has the Wasserstein metric at its core, yet is computable in pseudo-linear time. It enables such applications as detecting anomalies---such as viral marketing campaigns---in time series of states of large social networks. I will also review my work on social network control and, more specifically, strategic link recommendation for defending social networks against influence maximization-like attacks. The core of this work comprises perturbation analysis of stationary distributions of Markov chains and a pseudo-linear-time heuristic that solves the problem of defending a social network against external control.
In the second part of the talk, I will describe my work on economic networks, focusing on supply chain networks. In such networks, the agents are unreliable production units organized in tiers, where each tier functions as a market. The key question is whether rational self-interested agents will form a resilient supply chain network while functioning without coordination. I will show that, in the absence of supply caps in such networks, agents tend to concentrate links at Nash equilibrium, benefiting from supply variance, suggesting that competition can amplify uncertainty. When soft supply constraints are present, I show that the agents form expander-like equilibrium networks---known to be resilient---thereby providing the first example of endogenous formation of resilient supply chain networks.
Finally, I will discuss the future work, including ensuring safety and fairness of algorithms in hybrid social networks comprised of humans and artificial agents; and the design of intelligent distributed markets aiming for efficiency and resilience.
Bio: Victor Amelkin is a Postdoctoral Fellow at Warren Center for Network & Data Sciences at the University of Pennsylvania. His research focuses on network science, with an emphasis on analysis, modeling, and control of social and economic networks, at the intersection of Computer Science, Engineering, Operations Research, Microeconomics, and Computational Social Science. Victor obtained his PhD from the Department of Computer Science at the University of California, Santa Barbara in 2018. Earlier, he had spent several years as a software engineer in industry. He holds MSc and BSc degrees in Applied Mathematics and Computer Science.
Abstract: Over the last decade, we have seen impressive progress across all areas of autonomy, including perception, control and learning. At the same time, developing safe and secure cyber-physical systems (CPS) remains a challenging task, as demonstrated by multiple recent accidents involving autonomous vehicles, drones and aircraft. In this talk, I will present an integrated approach to assuring the safety and security of CPS through a combination of offline verification and online monitoring techniques.
For offline assurance, I have developed an approach, called Verisig, for verifying the safety of autonomous systems with neural network controllers. I will present an exhaustive evaluation on a neural-network-controlled (1/10-scale) autonomous racing car, both in terms of verification and experiments on the real platform. In the second part of the talk, I will describe my work on run-time monitoring of system safety, with applications to medical CPS. Specifically, I will present a detector for critical drops in the patient's oxygen content during surgery, with guaranteed performance regardless of varying physiological parameters such as metabolism. The detector is evaluated on real-patient data collected from the Children's Hospital of Philadelphia.
Bio: Radoslav Ivanov received the B.A. degree in computer science and economics from Colgate University in 2011, and the Ph.D. degree in computer and information science from the University of Pennsylvania in 2017. He is currently a Postdoctoral Fellow with the University of Pennsylvania. His research interests are broadly in the field of safe and secure autonomy, with a focus on verified machine learning, control theory and cyber-physical security. The natural application domains of his work are automotive and medical cyber-physical systems.
The recent success of machine learning relies heavily on the surge of big data, big models, and big computing. However, inefficient algorithms restrict the applications of machine learning to big data mining tasks. In terms of big data, serious concerns such as communication and data privacy should be rigorously addressed when we train models using large amounts of data located on multiple devices. In terms of big models, how to train a model that is too big for a single device is still an underexplored research area. To address these challenging problems in current big data science, we proposed several novel large-scale asynchronous distributed stochastic gradient and coordinate descent methods for efficiently solving big data mining problems, and also scaled up deep learning using model parallelism by parallelizing the layer-wise computation.
Bio: Zhouyuan Huo is a Ph.D. candidate in Computer Engineering at the University of Pittsburgh, working with Prof. Heng Huang. He received the B.E. degree in Electrical Engineering from Zhejiang University in 2014. His research interests lie at the intersection of Distributed Machine Learning, Deep Learning, Data Science, and Bioinformatics. He has published 21 academic papers in top-tier conferences, such as NeurIPS, ICML, KDD, EMNLP, AISTATS, IJCAI, AAAI, MICCAI, and ICDM.
The data economy is a rising ecosystem in which data are produced, distributed, and consumed at an unprecedented scale. On the one hand, the current data economy creates new levels of prosperity by driving rapid advances in machine learning and automation. On the other hand, it has some fundamental challenges that need to be addressed. First and foremost, how much is data worth? Data is valuable, yet a principled data valuation method is lacking. The answer to this question has profound implications: it will open up new data sources by facilitating and incentivizing data sharing and reduce economic inequality by allowing individuals to profit from their data. We also need to remain clear-eyed about the issues of privacy and security that arise from data-driven applications.
In this talk, I will mainly present a set of principled and efficient techniques for data valuation. These techniques can have broader applications beyond data pricing. For instance, they could be used to understand the importance of a single training point relative to others in machine learning and empower applications such as detecting noisy labels and defending against data poisoning attacks. I will also present my work in the space of privacy-preserving and trustworthy data analytics.
Bio: Ruoxi is currently a Postdoctoral Scholar in the Electrical Engineering and Computer Science (EECS) Department at UC Berkeley. She earned her PhD in the EECS Department from UC Berkeley in 2018 and a B.S. from Peking University in 2013. Ruoxi’s research interests lie broadly in machine learning, security, privacy, and cyber-physical systems. Her recent work is focused on developing theoretical foundations and practical algorithms for improving the fairness, privacy, and security of the data economy. Ruoxi is the recipient of several fellowships, including the Chiang Fellowship for Graduate Scholars in Manufacturing and Engineering, the 8108 Alumni Fellowship, and the Okamatsu Fellowship. She was selected for the Rising Stars in EECS program in 2017. Ruoxi’s work has been featured in multiple media outlets, including MIT Technology Review, IEEE Spectrum, and Synced.
Abstract: In this talk, I would like to present some recent work on fair division of indivisible goods under connectivity constraints. Agents have preferences over bundles and are only interested in receiving a connected bundle of items. This model is particularly relevant when the items have a spatial or temporal structure. We show that when the graph is a path, envy-freeness up to one good (EF1) allocations are guaranteed to exist for arbitrary monotonic utility functions over bundles, provided that either there are at most four agents, or there are any number of agents but they all have identical utility functions. Our existence proofs are based on classical arguments from the divisible cake-cutting setting, and involve discrete analogues of cut-and-choose, of Stromquist's moving-knife protocol, and of the Su-Simmons argument based on Sperner's lemma. For the case of two agents, we completely characterize the class of graphs G that guarantee the existence of EF1 allocations as the class of graphs whose biconnected components are arranged in a path. This talk is based on the paper published in ITCS 2019.
V. Bilò, I. Caragiannis, M. Flammini, A. Igarashi, G. Monaco, D. Peters, C. Vinci, W. S. Zwicker, Almost Envy-free Allocations with Connected Bundles, ITCS 2019.
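To make the EF1 notion in the abstract above concrete, here is a minimal sketch of how one might check a given allocation on a path. It assumes items are numbered 0, 1, 2, ... along the path, and it uses the common definition of EF1 in which any single good may be removed from the envied bundle; the cited paper's exact definition for connected bundles may differ. The names `additive`, `is_connected_on_path`, and `is_ef1` are hypothetical, introduced only for illustration, and the sketch only verifies an allocation rather than constructing one.

```python
from typing import Callable, FrozenSet, Iterable, List, Sequence

# A utility function maps a bundle (set of item indices) to a number.
Utility = Callable[[FrozenSet[int]], float]


def additive(values: Sequence[float]) -> Utility:
    """Build an additive utility from per-item values (an illustrative choice)."""
    return lambda bundle: sum(values[g] for g in bundle)


def is_connected_on_path(bundle: Iterable[int]) -> bool:
    """A bundle is connected on a path iff its items form a contiguous interval."""
    items = set(bundle)
    if not items:
        return True
    return max(items) - min(items) + 1 == len(items)


def is_ef1(bundles: List[Iterable[int]], utilities: List[Utility]) -> bool:
    """Check envy-freeness up to one good.

    Agent i may envy agent j only if removing some single good from
    j's bundle eliminates that envy (assumption: any good may be removed).
    """
    frozen = [frozenset(b) for b in bundles]
    for i, u in enumerate(utilities):
        own = u(frozen[i])
        for j, other in enumerate(frozen):
            if i == j or own >= u(other):
                continue  # no envy toward j
            if not any(own >= u(other - {g}) for g in other):
                return False  # envy persists after removing any single good
    return True


if __name__ == "__main__":
    u = additive([1, 1, 1, 1])  # both agents value every item equally
    print(is_ef1([{0, 1}, {2, 3}], [u, u]))
    print(is_ef1([{0}, {1, 2, 3}], [u, u]))
```

The utility functions are passed as callables so that arbitrary monotonic utilities, not just additive ones, can be plugged in.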
Short bio: Ayumi Igarashi is a JSPS postdoctoral fellow, hosted by Prof. Satoru Iwata at the University of Tokyo since April 2019. She completed her PhD at the Department of Computer Science, the University of Oxford, under the supervision of Prof. Edith Elkind. Her broad research interests include algorithmic game theory, combinatorial optimisation, and social choice. She has been particularly looking at problems with connectivity constraints imposed by a network, especially the existence and complexity of fairness and stability notions in various contexts, such as coalition formation games and fair division of indivisible items.
Even in the age of big data and machine learning, human knowledge and preferences still play a large part in decision making. For some tasks, such as predicting complex events like recessions or global conflicts, human knowledge remains the state of the art, unsurpassed by algorithms and statistical models. In other cases, a decision maker is tasked with utilizing human preferences to, for example, make a popular decision over an unpopular one. However, while often useful, eliciting data from humans poses significant challenges. First, humans are strategic, and may misrepresent their private information if doing so can benefit them. Second, when decisions affect humans, we often want outcomes to be fair, not systematically favoring one individual or group over another.
In this talk, I discuss two settings that exemplify these considerations. First, I consider the participatory budgeting problem in which a shared budget must be divided among competing public projects. Building on classic literature in economics, I present a class of truthful mechanisms and exhibit a tradeoff between fairness and economic efficiency within this class. Second, I examine the classic online learning problem of learning with expert advice in a setting where experts are strategic and act to maximize their influence on the learner. I present algorithms that incentivize truthful reporting from experts while achieving optimal regret bounds.
Bio: Rupert Freeman is a postdoc at Microsoft Research New York City. Previously, he received his Ph.D. from Duke University under the supervision of Vincent Conitzer. His research focuses on the intersection of artificial intelligence and economics, particularly in topics such as resource allocation, collective decision making, and information elicitation. He is the recipient of a Facebook Ph.D. Fellowship and a Duke Computer Science outstanding dissertation award.
Today, the widespread consensus is that, in addition to data centers in the cloud, machine learning tasks have to be performed starting from the network edge, namely on smart devices. In this context, we will highlight key challenges in learning at the network edge, including communication overhead and heterogeneity. We will introduce a class of novel methods that can reduce communication overhead for solving collaborative learning problems in heterogeneous settings. Our methods are simple to implement. They rely on an adaptive rule designed to detect slowly-varying information and, therefore, trigger the reuse of outdated information, and they come with rigorous performance guarantees in terms of convergence and communication reduction. We will corroborate analytical guarantees with impressive empirical tests on standard learning benchmarks.
Bio: Tianyi Chen received the B.Eng. degree in Communication Science and Engineering from Fudan University in 2014, and the M.Sc. and Ph.D. degrees in Electrical and Computer Engineering (ECE) from the University of Minnesota (UMN) in 2016 and 2019, respectively. Since August 2019, he has been with the Department of Electrical, Computer, and Systems Engineering at Rensselaer Polytechnic Institute as an Assistant Professor. During 2017-2018, he was a visiting scholar at Harvard University, University of California, Los Angeles, and University of Illinois Urbana-Champaign. His research interests lie in optimization, signal processing, and machine learning with applications to large-scale networked systems such as communication, computing systems, and energy systems. He was a Best Student Paper Award finalist in the 2017 Asilomar Conf. on Signals, Systems, and Computers. He received the National Scholarship from China in 2013, the UMN ECE Department Fellowship in 2014, and the UMN Doctoral Dissertation Fellowship in 2017.
The study of spreading processes has been a topic of interest for many years over a wide range of areas, including computer science, mathematical systems, biology, physics, social sciences, and economics. More recently, there has been a resurgence of interest in the study of spreading processes focused on spread over networks, motivated not only by security threats posed by computer viruses, but also recent devastating outbreaks of infectious diseases and the rapid spread of opinions and misinformation over social networks. Up to this point these network-dependent spread models have not been validated by real data.
In this talk, I present and analyze mathematical models for network-dependent spread, providing necessary and sufficient conditions for uniqueness of the healthy state and proving the existence of an endemic equilibrium. I use the endemic equilibrium results to estimate the healing and infection parameters of the model. Using this technique I validate the spread model by employing John Snow's classical work on cholera epidemics in London in the 1850s, with surprisingly good results. Given the demonstrated validity of the model, I discuss several control strategies for mitigating spread, and formulate a tractable antidote administration problem that significantly reduces spread. Motivated by competing USDA farm subsidy programs, I present and analyze a model for multiple competing viruses spreading over multi-layered networks. Further, I present a technique for estimating the healing and infection parameters of the model from time series data. I validate the multi-virus model by applying this technique to the USDA farm subsidy dataset, and improve upon the results by employing an online learning algorithm. Finally, motivated by recent contaminated water-related outbreaks in Scandinavia, I propose a layered infrastructure network model and explore when the source of an outbreak can be detected. I conclude by discussing various directions for compelling future work, including combining the online learning algorithms with the control techniques to develop data-driven mitigation strategies.
The rapid development of low-cost sensors, smart devices, communication networks, and learning algorithms has enabled data driven decision making in large-scale multi-agent systems. Prominent examples include mobile robotic networks and autonomous systems. The key challenge in these systems is in handling the vast quantities of information shared between the agents in order to find an optimal policy that maximizes an objective function. Among potential approaches, distributed reinforcement learning, which is not only amenable to low-cost implementation but can also be implemented in real time, has been recognized as an important approach to address this challenge.
The focus of this talk is to consider the policy evaluation problem in multi-agent reinforcement learning, one of the most fundamental problems in this area. In this problem, a group of agents operate in an unknown environment, where their goal is to cooperatively evaluate the global discounted accumulative reward composed of local rewards observed by the agents. For solving this problem, I consider a distributed variant of the popular temporal difference learning, often referred to as TD(
Bio: Thinh T. Doan is a TRIAD postdoctoral fellow at the Georgia Institute of Technology, joint between the School of Electrical and Computer Engineering (ECE) and the School of Industrial and Systems Engineering. He was born in Vietnam, where he received his Bachelor's degree in Automatic Control from Hanoi University of Science and Technology in 2008. He obtained his Master's and Ph.D. degrees, both in ECE, from the University of Oklahoma in 2013 and the University of Illinois at Urbana-Champaign (UIUC) in 2018, respectively. At Illinois, he was the recipient of the Harriett & Robert Perry Fellowship in ECE in 2016 and 2017. His research interests lie at the intersection of control theory, optimization, distributed algorithms, and applied probability, with the main applications in machine learning, reinforcement learning, and multi-agent systems.
In security games, a defender commits to a mixed strategy for protecting a set of n targets of values v_i; an attacker, knowing the defender's strategy, chooses which target to attack and for how long. We study a natural version in which the attacker's utility grows linearly with the time he spends at the target, but drops to 0 if he is caught. The defender's goal is to minimize the attacker's utility. The defender's strategy consists of a schedule for visiting the targets; a metric space determines how long it takes to travel between targets. Such games naturally model a number of real-world scenarios, including protecting computer networks from intruders, animals from poachers, etc.
We study this problem in two scenarios:
1. When all the distances between targets are the same, then optimal and near-optimal strategies naturally correspond to sequences in which each character occupies a prescribed fraction p_i of the sequence, and appears "close to evenly spread." We formalize these notions, and give efficient algorithms for finding sequences whose properties are strong enough to provide optimal or near-optimal strategies in the corresponding security game.
2. When distances between targets are arbitrary, the problem subsumes the well-known Traveling Salesman Problem (TSP), but generalizes it in novel and interesting ways. We give a polynomial-time O(log n) approximation algorithm for finding the best strategy for the defender.
Bio: David Kempe received his Ph.D. from Cornell University in 2003, and has been on the faculty in the computer science department at USC since the Fall of 2004, where he is currently an Associate Professor. His primary research interests are in computer science theory and the design and analysis of algorithms, with a particular emphasis on social networks, topics at the intersection of economics and computing, and algorithmic learning questions. He is a recipient of the NSF CAREER award, the VSoE Junior Research Award, the ONR Young Investigator Award, a Sloan Fellowship, and an Okawa Fellowship, in addition to several USC mentoring awards.
Refreshments at 3:30 pm
In this talk we discuss ongoing research on computational modeling and understanding of social, behavioral, and cultural phenomena in multi-party interactions. We first describe the background work in the project that pioneered a novel approach to social computing, as part of the IARPA SCIL program. We demonstrate how various linguistic cues reveal the social dynamics in group interactions, based on a series of experiments conducted in virtual online chat rooms, and then show that these dynamics generalize to other forms of communication, including traditional face-to-face discourse as well as large-scale online interaction via social media.
We outline several extensions of the basic socio-behavioral model that move beyond recognition and understanding of social dynamics and attempt to quantify and measure the effects that sociolinguistic behaviors by individuals and groups have on other discourse participants, including the potential for manipulating online information cascades and related persuasion campaigns. We discuss the possibility of constructing autonomous artificial agents capable of exerting influence and manipulating human behavior in certain situations. We then look into ways to build defenses against such manipulation, specifically focusing on social engineering attacks.
This research has been supported in part by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), the Defense Advanced Research Projects Agency (DARPA), the Army Research Laboratory (ARL), and the Air Force Research Laboratory (AFRL). Additional funding was provided by the Lockheed Martin Corporation.
Bio: Dr. Tomek Strzalkowski is Professor of Cognitive Science at RPI. Prior to joining RPI this past summer, he was Director of SUNY Albany’s ILS Institute (www.ils.abany.edu) and Professor of Computer Science for nearly 20 years. Between 1994 and 2000 he was a Principal Scientist at GE R&D in Niskayuna, directing the Natural Language Group. Before coming to GE, he was research faculty at the Courant Institute of New York University. Dr. Strzalkowski has done research in computational linguistics and sociolinguistics, information retrieval, question-answering, human-computer dialogue, serious games, social media analytics, formal semantics, and reversible grammars. He has directed research sponsored by IARPA, DARPA, ARL, AFRL, NSF, the European Commission, and NSERC, as well as a number of industry-funded projects. He was involved in IBM’s Jeopardy! Challenge in advanced question answering. Dr. Strzalkowski has published over a hundred and fifty scientific papers, and is the editor of several books, including Advances in Open Domain Question Answering. He serves on the Editorial Board of the journal Natural Language Engineering.
Healthcare is on the verge of a revolution driven by ever-increasing knowledge and heterogeneous big data. With advances in machine learning and data mining, diagnostic standards, medical genomics, and treatment matching and personalization have been changing rapidly. On the other hand, challenges in medical informatics motivate and demand the development of novel machine learning principles, models, and algorithms. In this talk, we will touch upon several medical topics where machine learning methods have helped the knowledge discovery from complex and massive data. We will then focus on the discussion of one of our recent works on multi-task learning, and also demonstrate how we study the theoretical properties of these innovative approaches and their practical performance.
Bio: Jinbo Bi is a mathematician, computer scientist and educator who has pursued her career as a medical informatics investigator. She is currently a Professor in the Department of Computer Science and Engineering at the University of Connecticut. She has 20 years of experience in developing machine learning approaches to address life science challenges. First in industry (until 2010), she conducted cutting-edge research on cancer detection from imaging modalities at Siemens and on clinical decision support for trauma patient care at Massachusetts General Hospital. Then in an academic setting at the University of Connecticut, she has been designing innovative and multidisciplinary approaches for computational genomics and computational neuroscience to refine classification of complex disorders. She has authored over 100 peer-reviewed articles in leading journals and conferences in the respective fields, with best paper awards from IEEE, and has served continuously as a senior program committee member for ICML, NeurIPS, KDD and AAAI, and as the general chair for IEEE BIBM in 2019. She serves on the advisory committee for the NIH Biostatistics division (NIAAA/NIDA). She received an NIH mid-career Independent Scientist Award in 2017 and a Connecticut Women of Innovation award in 2019.
Refreshments at 10:30 am
The rise of online platforms has the potential to revolutionize traditionally small-group endeavors, such as informed decision-making and collaborative work, by allowing them to operate in a distributed manner at much larger scales. In this talk, I will discuss my work on the design of appropriate mechanisms to fully leverage this development, and some exciting avenues for future research.
I will describe my work on Participatory Budgeting (as part of the Stanford Crowdsourced Democracy Team): in particular, Knapsack Voting, a novel voting method for this setting. I will then demonstrate how this scheme works in practice, and share theoretical and empirical insights as to how it performs better than conventional methods.
Despite the ever-increasing use of AI to make societal decisions, a study of the fairness properties of preference aggregation methods has hitherto remained lacking. In this regard, I will describe a general framework for quantifying the fairness properties of aggregation methods based on ideas from multi-objective optimization. I will also briefly describe some of my work on studying platforms for crowdsourced collaboration, and the design of suitable incentive and learning mechanisms to improve their efficacy.
Bio: Anilesh Krishnaswamy is a PhD student at Stanford University, advised by Prof. Ashish Goel. His research looks at problems in collective intelligence, in particular, those pertaining to online platforms for democratic decision-making, social computing, and labor markets. He is one of the designers of the Stanford Participatory Budgeting Platform (see link below), a digital voting tool that is used by dozens of cities across the US to decide the fate of millions of dollars of public funds. He was also a visiting researcher at Tel-Aviv University during the summer of 2018. Prior to graduate studies at Stanford, he obtained a bachelor’s degree in Electrical Engineering from the Indian Institute of Technology Madras.
Refreshments at 10:30 am
Deep learning has achieved very impressive improvements in many applications such as computer vision. In the literature, however, many theoretical and empirical questions still remain open. Why does deep learning work so well? How should we design a network for a specific task (in a principled way)? Can we train a network better and faster? Can we learn a network that can be deployed in embedded systems? To answer such questions, in this talk I will provide some insight on three important issues in deep learning, i.e., efficiency, interpretability, and robustness, from the perspective of optimization. Specifically in my talk, efficiency aims to accelerate the computation of deep models while preserving their high accuracy. Interpretability aims to understand the physical meaning of the training objectives in deep learning (including the network architectures). Robustness aims to analyze the convergence and generalization of deep models. I will present some algorithms and results from my recent work on deep learning, some of which have been applied to real products in computer vision and geoscience, and I hope that my work can continue to contribute to other communities.
Biography: Dr. Ziming Zhang is currently a Principal Research Scientist at Mitsubishi Electric Research Laboratories (MERL). Before joining MERL he was a Research Assistant Professor at Boston University in 2015-2016. He completed his PhD at Oxford Brookes University, U.K., in 2013 under the supervision of Prof. Philip Torr (now a professor at the University of Oxford). His research interests lie in computer vision and machine learning, including object recognition and detection, person re-identification, zero-shot learning, optimization, deep learning, etc. His work has appeared in TPAMI, CVPR, ICCV, ECCV and NIPS. He serves as a reviewer/PC member for top-tier conferences such as CVPR, ICML, NIPS, AAAI, AISTATS, IJCAI, UAI and ICLR. He won an R&D 100 Award in 2018.
Refreshments at 10:30 am
Abstract: Over the past decade, machine learning methods such as deep neural networks have made a huge impact on a variety of complex tasks. On the other hand, very little is understood about when and why these ML methods work in practice. Bridging this gap requires better understanding of the non-convex optimization paradigm commonly used in training deep neural networks, as well as better modeling of real world data. My research aims at providing theoretical foundations and better algorithms to this emerging domain.
This talk will show a few results. First, we study non-convex methods and their generalization performance (or sample efficiency) on common ML methods such as over-parameterized matrix models. Our result highlights the role of the optimization algorithm in explaining generalization when there are more trained parameters than the size of the dataset, and the benefit to optimization by adding more parameters. Next, we consider the problem of predicting the missing entries of high-dimensional tensor data. We show that understanding generalization can inform the choice of tensor models depending on the amount of data available. Lastly, I will briefly mention my other work on designing efficient algorithms to deal with large-scale graph data.
Bio: Hongyang Zhang is a Ph.D. candidate in CS at Stanford University, co-advised by Ashish Goel and Greg Valiant. His research efforts lie in machine learning and algorithms. His expertise includes neural network analysis, non-convex optimization, social network algorithms and game theory. He is a recipient of the best paper award at the Conference on Learning Theory 2018.
Teams of agents must often assign tasks among themselves and plan collision-free paths to the task locations. Examples include autonomous aircraft towing vehicles, automated warehouse systems, office robots, and game characters in video games. In the near future, autonomous aircraft towing vehicles will tow aircraft all the way from the runways to their gates (and vice versa), thereby reducing pollution, energy consumption, congestion, and human workload. Today, hundreds of robots already navigate autonomously in Amazon fulfillment centers to move inventory pods all the way from their storage locations to the inventory stations that need the products they store. Coordination problems for these teams of agents are NP-hard in general, yet we must make fast and good decisions when it comes to assigning tasks and finding collision-free paths for these teams of agents. Better task assignment and shorter paths result in higher throughput or lower operating costs (since fewer robots are required). In this context, (1) I will describe several versions of task-assignment and path-planning problems for teams of agents, their complexities, algorithms for solving them, and their applications; and (2) I will present an algorithmic framework that unifies tools and techniques from artificial intelligence, robotics, operations research, and theoretical computer science to coordinate long-term operations for real-world multi-agent systems at a scale of hundreds of agents and thousands of tasks.
Biography: Hang Ma is a Ph.D. candidate at the University of Southern California. He holds a B.Sc. (First Class with Distinction) in Computing Science from Simon Fraser University, a B.Eng. in Computer Science and Technology from Zhejiang University (jointly with Simon Fraser University), and an M.Sc. in Computer Science from McGill University. His research interests lie in artificial intelligence, robotics, and machine learning, with a focus on decision making for multi-agent systems. He is a winner of the Outstanding Paper Award in the Robotics Track of the International Conference on Automated Planning and Scheduling 2016, a Commercialization Award from the USC Stevens Center for Innovation, and a USC Annenberg Graduate Fellowship. More information can be found on his webpage.
Refreshments at 3:30 pm
Different from traditional machine learning, interactive learning allows a learner to be involved in the data collection process through interacting with the environment. An interactive learner can carefully avoid collecting redundant information, thus being able to make accurate predictions with a small amount of data. Two key questions in interactive learning research are of central interest: first, can we design and analyze interactive learning algorithms that have data efficiency, computational efficiency and robustness guarantees? Second, can we identify novel interaction models which learners can benefit from? In this talk, I will answer both questions in the affirmative. In the first part of the talk, I will present our work on efficient noise-tolerant active learning of linear classifiers with near-optimal label requirements. In the second part, I will discuss a new interactive learning model, namely warm-starting contextual bandits, and present an algorithm in this model with robustness guarantees. I will conclude my talk by outlining several promising directions for future research.
Bio: Chicheng Zhang is a postdoctoral researcher in the machine learning group at Microsoft Research New York City. He received a Bachelor's degree from Peking University in 2012 and a PhD in Computer Science from UC San Diego in 2017. Chicheng has broad research interests in machine learning, including but not limited to active learning, contextual bandits, unsupervised learning and confidence-rated prediction. His main research interests lie in the design and analysis of interactive machine learning algorithms.
Refreshments at 3:30 pm
Many important problems involve decision making under uncertainty, namely, choosing options based on imperfect observations, with unknown outcomes. The multi-armed bandit is a simple yet quite useful framework for algorithmic decision making under uncertainty. A decision-maker faces an “exploration vs. exploitation” dilemma. On the one hand, the decision-maker attempts to exploit the information that he or she has gathered up to now. On the other hand, the decision-maker desires information to improve the quality of their decisions in the future. The framework has many practical applications such as the cold-start problem in a recommendation system and game-tree search in abstract games. A widely used idea for this framework is the upper confidence bound (UCB), which assigns additional value to options of high uncertainty. For instance, state-of-the-art Game-of-Go algorithms, such as AlphaGo by DeepMind, adopt some variant of UCB.
In this talk, I will address two major challenges in this topic. (i) The efficiency of Thompson sampling (TS), the oldest heuristic algorithm, proposed in the 1930s: we show that TS is as efficient as the best version of UCB. (ii) The limitation of value function based approaches: I show that UCB and TS are not always efficient by giving examples where these value function approaches are not successful. I show that the efficiency of information can be represented as an optimization problem corresponding to the model structure, and propose an algorithm based on this optimization.
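As a rough illustration of the upper confidence bound idea mentioned in this abstract, the sketch below implements the textbook UCB1 rule: each arm's index is its empirical mean reward plus a bonus that shrinks as the arm is pulled more often. This is one standard variant chosen for illustration, not necessarily the version analyzed in the talk. The function name `ucb1`, the `pull` callback, and the Bernoulli arm means in the demo are all hypothetical, and rewards are assumed to lie in [0, 1].

```python
import math
import random
from typing import Callable, List, Tuple


def ucb1(pull: Callable[[int], float], n_arms: int, horizon: int) -> Tuple[List[int], List[float]]:
    """Run UCB1 for `horizon` rounds; return per-arm pull counts and reward sums.

    `pull(arm)` is a caller-supplied function returning a reward in [0, 1].
    """
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # pull each arm once so every index is defined
        else:
            # Index = empirical mean + exploration bonus sqrt(2 ln t / n_a).
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward
    return counts, sums


if __name__ == "__main__":
    rng = random.Random(0)
    means = [0.2, 0.5, 0.8]  # hypothetical Bernoulli arm means
    counts, _ = ucb1(lambda a: 1.0 if rng.random() < means[a] else 0.0, 3, 1000)
    print(counts)  # pull counts per arm
```

Rarely pulled arms get a large bonus, so they are revisited occasionally, while arms with high observed means are exploited most of the time.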
Bio: Junpei Komiyama is a research associate at the Institute of Industrial Science, the University of Tokyo. He received a Ph.D. from the University of Tokyo in 2016. He belongs to the machine learning and data mining communities. His research interest lies in data-driven decision making, and his research topics include multi-armed bandits, experimental design, statistical testing, optimization, computational economics, and algorithmic fairness.
Refreshments at 3:30 pm
Recent years have seen data-driven algorithms deployed in increasingly high-stakes environments. These algorithms are often highly complex, making them effectively “black boxes”; this potentially exposes various stakeholders (such as end-users, or the agencies deploying them) to risks, such as unfair treatment or inadvertent data breaches. In response, government agencies and professional societies have highlighted fairness and transparency as key design paradigms in AI/ML applications.
In this talk I will discuss our recent work on the mathematical foundations of algorithmic transparency and fairness. From the transparency perspective, I will discuss how we design transparency measures that are guaranteed to satisfy certain natural desiderata; in addition, I will discuss a recent line of work showing how some natural transparency measures may be used by an adversary in order to extract private user information. Regarding fairness, I will discuss how we apply fairness paradigms to algorithms, in particular our work on designing and deploying fair allocation algorithms; our results show that humans respond well to provably fair algorithms, and are willing to collaborate effectively even in strategic domains. Finally, I will discuss how we apply learning-theoretic approaches to fairness via a novel paradigm for adapting game-theoretic solution concepts in data-driven domains.
Bio: Yair Zick is an assistant professor at the Department of Computer Science at the National University of Singapore. He obtained his PhD (mathematics) from Nanyang Technological University, Singapore in 2014, and a B.Sc (mathematics, "Amirim" honors program) from the Hebrew University of Jerusalem. His research interests include computational fair division, computational social choice, algorithmic game theory and algorithmic transparency. He is the recipient of the 2011 AAMAS Best Student Paper award, the 2014 Victor Lesser IFAAMAS Distinguished Dissertation award, the 2016 ACM EC Best Paper award, and the 2017 Singapore NRF Fellowship.
Refreshments 3:30 pm
Recent advances in machine learning have spawned innovation and prosperity in various fields. In machine learning models, nonlinearity provides greater flexibility and a better fit to the data. However, the improved model flexibility is often accompanied by challenges such as overfitting, higher computational complexity, and less interpretability. Thus, my research has focused on designing new feasible nonlinear machine learning models to address these challenges across various data scales, and on bringing new discoveries in both theory and applications. In this talk, I will introduce my newly designed nonlinear machine learning algorithms, such as additive models and deep learning methods, to address these challenges and validate the new models via emerging biomedical applications.
First, I introduce new interpretable additive models for regression and classification, and address the overfitting problem of nonlinear models in small and medium-sized data. I derive the model convergence rate under mild conditions in the hypothesis space, and uncover new potential biomarkers in an Alzheimer's disease study. Second, I propose a deep generative adversarial network to analyze the temporal correlation structure in longitudinal data, and achieve state-of-the-art performance in early diagnosis of Alzheimer's disease. Meanwhile, I design a new interpretable neural network model to improve the interpretability of the results of deep learning methods. Further, to tackle the shortage of labeled data in large-scale data analysis, I design a novel semi-supervised deep learning model and validate its performance in the application of gene expression inference.
Bio: Xiaoqian Wang is a Ph.D. candidate in Computer Engineering at the University of Pittsburgh. She received the B.S. degree in Bioinformatics from Zhejiang University in 2013. Her research interests span across multidisciplinary areas of machine learning, data mining, computational neuroscience, cancer genomics, and precision medicine. She has published 21 papers in top-tier conferences such as NIPS, ICML, KDD, IJCAI, AAAI, RECOMB, and ECCB. She received the Best Research Assistant Award at the University of Pittsburgh in 2017.
Refreshments 3:30 pm
Most people interact with machine learning systems on a daily basis. Such interactions often happen in strategic environments where people have incentives to manipulate the outcome of the learning algorithms. As machine learning plays a more prominent role in our society, it is important to understand how people's incentives can affect the learning problems, and to design reliable algorithms that perform well in these strategic environments.
Bio: Yu Cheng is a postdoctoral researcher in the Department of Computer Science at Duke University, hosted by Vincent Conitzer, Rong Ge, Kamesh Munagala, and Debmalya Panigrahi. He obtained his Ph.D. in Computer Science from the University of Southern California in 2017, advised by Shang-Hua Teng. His research interests include machine learning, game theory, and optimization.
Refreshments 3:30 pm
Transfer learning aims to mimic the human cognitive process by adapting previously well-learned knowledge to facilitate new, challenging learning tasks. In this presentation, I will briefly summarize transfer learning in visual recognition tasks and focus on two specific face recognition topics, i.e., missing modality and one-shot learning. First, we often confront the problem that no face samples of the target modality are available in the training stage, which arises when the face data are multi-modal. To overcome this, we first borrow an auxiliary database with complete modalities, then propose a two-directional knowledge transfer to solve the missing modality issue. Second, there is often the challenge that only one sample of some persons is accessible in the training process. It is very difficult for existing learning approaches to handle this, since limited data cannot represent the data variance well. Thus, we develop a novel generative model to synthesize meaningful data for one-shot persons by adapting the data variances from other normal persons.
Biography: Zhengming Ding received the B.Eng. degree in information security and the M.Eng. degree in computer software and theory from the University of Electronic Science and Technology of China (UESTC), China, in 2010 and 2013, respectively. He received the Ph.D. degree from the Department of Electrical and Computer Engineering, Northeastern University, USA in 2018. He has been a faculty member affiliated with the Department of Computer, Information and Technology, Indiana University-Purdue University Indianapolis since 2018. His research interests include transfer learning, multi-view learning and deep learning. He received the National Institute of Justice Fellowship during 2016-2018. He was the recipient of the best paper award (SPIE 2016) and was a best paper candidate (ACM MM 2017). He is currently an Associate Editor of the Journal of Electronic Imaging (JEI).
How can we intelligently acquire information for decision making, when facing a large volume of data? In this talk, I will focus on learning and decision making problems that arise in robotics, scientific discovery and human-centered systems, and present how we can develop principled approaches that actively extract information, identify the most relevant data for the learning tasks and make effective decisions under uncertainty. As an example, I will introduce the optimal value of information problem for decision making, and show that for a large class of adaptive information acquisition problems that are known to be NP-hard, one could devise efficient surrogate objectives that are amenable to greedy optimization, while still achieving strong approximation guarantees. I will further talk about a few practical challenges in real-world decision-making systems such as complex constraints, complex action space, and rich interfaces. More concretely, I will elaborate on how to address these practical concerns through a variety of applications, ranging from sequential experimental design for scientific discovery to interactive machine teaching for human learners.
Yuxin Chen is a postdoctoral scholar in the Department of Computing and Mathematical Sciences at the California Institute of Technology (Caltech). Prior to Caltech, he received his Ph.D. in computer science from ETH Zurich in 2017. His research interest lies broadly in probabilistic reasoning and machine learning. He was a recipient of the Google European Doctoral Fellowship in Interactive Machine Learning, the Swiss SNSF Early Postdoc.Mobility Fellowship, and the PIMCO Postdoctoral Fellowship in Data Science. He currently focuses on developing interactive machine learning systems that involve active learning, sequential decision making, interpretable models and machine teaching.
Over the last decades, efficient computational models and methods have become important tools for analyzing and solving various societal problems concerning decision-making agents in real-world systems. In this talk, I will present decision-theoretic models and tools for solving problems in security, homelessness, and crowdsourcing domains. In the first part of this talk, I will present several important computational game-theoretic frameworks for real-world settings of security, including computational methods for computing Nash equilibria and machine learning (ML) tools for estimating the parameters and structure of these games. In the second part of this talk, I will discuss the application of ML and algorithmic techniques to allocate housing resources to homeless youth and highlight the application of AI and data mining techniques to reduce HIV among homeless youth. Finally, I will briefly discuss ongoing and future work in these domains, including exciting work on designing and evaluating crowdsourcing systems.
Bio: Hau Chan is an assistant professor in the Department of Computer Science and Engineering at the University of Nebraska-Lincoln. His current research aims to address the modeling and computational aspects of societal problems (e.g., game-theoretic models for security domains, housing allocations for homeless youth, and market designs for crowdsourcing contests) from AI, data mining and machine learning perspectives. He received his Ph.D. in computer science from Stony Brook University in 2015 and completed three years of postdoctoral fellowships, most recently at the Laboratory for Innovation Science at Harvard University in 2018. He is a recipient of an NSF Graduate Research Fellowship, an NSF East Asia and Pacific Summer Fellowship, a 2015 SDM Best Paper Award, a 2016 AAMAS Best Student Research Paper Award, and a 2018 IJCAI Distinguished PC Member recognition.
Refreshments 3:30 pm
Abstract: In the past few years, huge advances have been made in machine learning, which has transformed many fields such as computer vision, speech processing, and even games. A key "secret sauce" in the success of these models is the ability of certain architectures to learn good representations of complex data; that is, preprocessing the data in a way that makes the optimization either more robust or more efficiently solvable.
We investigate three instances where encoding structure in the feature variable facilitates optimization. In the first case, we look at word vectors as language-modeling mathematical constructs, and their use in data mining applications. Then, we consider encoding data used in distributed optimization for robustness against network delays and failures. Finally, we investigate the curious ability of proximal methods to quickly identify sparsity patterns in an optimization variable, which greatly facilitates feature selection. These three projects span several key areas of the growing field of machine learning, focusing on the optimization difficulties and curiosities in each area.
Bio: Yifan Sun received her PhD in Electrical Engineering from UCLA in 2015. She worked at Technicolor Research in Palo Alto, California, for two years after graduate school, focusing on machine learning projects. She is now a postdoctoral researcher at the University of British Columbia in Vancouver. Her research interests are convex optimization, semidefinite optimization, first-order and stochastic methods, and machine learning interpretability.
Refreshments served at 3:30pm.
Nonlinear stochastic optimization problems arise in a wide range of applications, from acoustic/geophysical inversion to deep learning. The scale, computational cost, and difficulty of these models make classical optimization techniques impractical. To address these challenges, we have developed new optimization methods that, in addition, are well suited for distributed computing implementations. Our techniques employ adaptive sampling strategies that gradually increase the accuracy in the step computation in order to achieve efficiency and scalability, and incorporate second-order information by exploiting the stochastic nature of the problem. We provide some interesting (and perhaps surprising) complexity results for our methods. The performance of our algorithm is illustrated on large-scale machine learning models, both convex and non-convex. We conclude by highlighting some open questions that arise when training deep neural networks.
Bio: Raghu Bollapragada is a PhD candidate in the Industrial Engineering and Management Sciences department at Northwestern University under the supervision of Professor Jorge Nocedal. During his graduate study, he was a visiting researcher at INRIA, Paris, hosted by Professor Alexandre d’Aspremont. He received his Bachelor’s and Master’s Dual Degree in Mechanical Engineering with a minor in Operations Research from the Indian Institute of Technology (IIT) Madras, India. His current research interests are in nonlinear optimization and its applications in machine learning. He has received the IEMS Arthur P. Hurter Award for outstanding academic excellence, the McCormick terminal year fellowship for an outstanding terminal-year PhD candidate, and the Walter P. Murphy Fellowship at Northwestern University.
Abstract: Network science has focused largely on the properties of a single isolated network that does not interact with or depend on other networks. In reality, many real networks, such as power grids and transportation and communication infrastructures, interact with and depend on other networks. I will present a framework for studying the vulnerability and the recovery of networks of interdependent networks. In interdependent networks, when nodes in one network fail, they cause dependent nodes in other networks to also fail. This is also the case when some nodes, such as certain locations, play a role in two networks (multiplexing). This may happen recursively and can lead to a cascade of failures and to a sudden fragmentation of the system. I will present analytical solutions for the critical threshold and the giant component of a network of n interdependent networks. I will show that the general theory has many novel features that are not present in the classical network theory. When recovery of components is possible, global spontaneous recovery of the networks and hysteresis phenomena occur, and the theory suggests an optimal repairing strategy for a system of systems. I will also show that interdependent networks embedded in space are significantly more vulnerable compared to non-embedded networks. In particular, small localized attacks may lead to cascading failures and catastrophic consequences. Thus, analyzing data from real networks of networks is essential for understanding system vulnerability.
Bio: Shlomo Havlin is a professor in the Department of Physics at Bar-Ilan University, Ramat-Gan, Israel. He served as the president of the Israel Physical Society from 1996 to 1999, as dean of the Faculty of Exact Sciences from 1999 to 2001, and as chairman of the Department of Physics from 1984 to 1988. He has made fundamental contributions to statistical physics and contributed to multidisciplinary fields such as complex networks, climate, medicine, biology, geophysics, and computer science. Shlomo has published more than 750 scientific papers, including over 33 Nature and PNAS papers and 68 Phys. Rev. Lett. papers. His papers are highly cited. Ten of his papers are among the top one percent of most-cited papers in the last ten years. In 2017 he was one of the two most cited scientists in Israel. Shlomo has been a fellow of the American Physical Society since 1995 and a fellow of the Institute of Physics in England since 2000. He was head of the Excellence National Network Center supported by the Israel Science Foundation from 1990 to 2011 and head of the Minerva Center from 1994 to 2011. He is on the editorial boards of Physica A, New Journal of Physics, and Fractals, and is Co-Editor of Europhysics Letters. Shlomo received the Landau Prize for outstanding Research (Israel, 1988), the Humboldt Senior Scientist Award (Germany, 1992), the Nicholson Medal of the American Physical Society (USA, 2006), the Weizmann Prize (Israel, 2009), the Lilienfeld Prize for “a most outstanding contribution to physics” of the American Physical Society (USA, 2010), the Rothschild Prize in Physical and Chemical Sciences (Israel, 2014), the Order of the Star of Italy by the President of Italy (2017), and the Israel Prize in Physics and Chemistry (2018).
Refreshments served at 1:30pm.
Dynamic graph mining can elucidate the activity of in-network processes in diverse application domains, from social, mobile and communication networks to infrastructure and biological networks. Compared to static graphs, the temporal information of when graph events occur is an important new dimension for improving the quality, interpretation and utility of mined patterns. However, mining dynamic graphs poses an important, though often overlooked, challenge: observed data must be analyzed at an appropriate temporal resolution (timescale), commensurate with the underlying rate of application-specific processes. If the temporal resolution is too high, evidence for ongoing processes may be fragmented in time; if it is too low, data relevant to multiple ongoing processes may be mixed, thus obstructing discovery. Existing approaches for dynamic graph mining typically adopt a fixed timescale (e.g., minutes, days, years), and they mine for patterns in the corresponding aggregated graph snapshots. However, timescale-aware methods must consider non-uniform resolution across both time and the graph, and thus account for heterogeneous network processes evolving at varying rates in different graph regions. In this talk I will discuss our recent work on detecting an appropriate timescale for community activity and information propagation in social networks.
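The fixed-timescale aggregation described in this abstract can be illustrated with a minimal Python sketch. It is not the speaker's method: the function name `aggregate_snapshots`, the `(u, v, t)` event format, and the toy data are all assumptions made for illustration. It only shows how the choice of window width changes the snapshots that a pattern miner would see.

```python
from collections import defaultdict


def aggregate_snapshots(events, window):
    """Group timestamped edges (u, v, t) into graph snapshots of fixed width.

    Hypothetical helper: snapshot k holds every edge whose timestamp t
    satisfies k * window <= t < (k + 1) * window.
    """
    snapshots = defaultdict(set)
    for u, v, t in events:
        snapshots[int(t // window)].add((u, v))
    return dict(snapshots)


# Toy data (assumed): one burst of activity around t = 0..2, another at t = 10..11.
events = [("a", "b", 0), ("b", "c", 1), ("a", "c", 2),
          ("d", "e", 10), ("e", "f", 11)]

fine = aggregate_snapshots(events, window=1)     # resolution too high
medium = aggregate_snapshots(events, window=5)   # roughly matches the bursts
coarse = aggregate_snapshots(events, window=20)  # resolution too low

for name, snaps in [("fine", fine), ("medium", medium), ("coarse", coarse)]:
    print(name, {k: sorted(edges) for k, edges in sorted(snaps.items())})
```

With a window of 1 each burst is split across several single-edge snapshots, and with a window of 20 the two unrelated bursts are merged into one snapshot. Only the intermediate window keeps each burst together and separate from the other, which is the fragmentation-versus-mixing trade-off the abstract describes.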
Bio: Petko Bogdanov is an Assistant Professor in the computer science department of the University at Albany, SUNY. His research interests include data mining and management with a focus on graph data and applications to bioinformatics, neuroscience, material informatics and sociology. Previously, he was a postdoctoral fellow in the department of computer science at the University of California, Santa Barbara. He received his PhD and MS in Computer Science from the University of California, Santa Barbara in 2012 and his BE in Computer Engineering from the Technical University of Sofia in 2005. Dr. Bogdanov is a member of the IEEE and the ACM, and his research has been sponsored by grants from NSF, DARPA and ONR.
Abstract: Many real-life situations involve dividing a set of resources among agents in an impartial or equitable manner. Fair division is the formal study of such situations, accounting for the fact that different agents might have different preferences over the goods that need to be divided. In this talk, we will discuss the fair division of discrete (or indivisible) goods, and present a recent result showing that the seemingly incompatible notions of fairness and economic efficiency can be achieved simultaneously.
Bio: Rohit Vaish is a postdoctoral research associate in Prof. Lirong Xia's group at the Department of Computer Science at RPI. Prior to that, he received his PhD from the Indian Institute of Science. His research lies at the intersection of computer science and economics, including topics such as voting, stable matching, fair division, and kidney exchange.
The explosion of the information age has thrust us into a new, interconnected, and uncharted future. We send data in thinly-wrapped packets through digital seas, forge connections with unknown cyber-entities, and generally, we believe that we are safe in cyberspace. But as with any uncharted domain, there be dragons...
Hackers are the dragons of the digital age. But there can be good dragons, and there can be evil dragons. Maintaining the balance, through the recruitment and training of "good" hackers, is a hard challenge facing the cybersecurity community. This talk will briefly explore the current paths for laypersons to become hackers, discuss developments and resources in that area, and then move on to a look at the future: replacing us all with very clever AI algorithms, and the new, uncharted frontier of fully-autonomous hacking systems. But as with any uncharted frontier, there be dragons...
Bio: Yan Shoshitaishvili (RPI class of 2006) is an assistant professor at Arizona State University, where he leads research into automated program analysis and vulnerability identification techniques. As part of this, Yan led Shellphish's participation in the DARPA Cyber Grand Challenge, applying his research to the creation of a fully autonomous hacking system that will one day destroy us all. Underpinning this system is angr, an open-source binary analysis project founded by Yan (and others!) and carefully pushed toward sentience over the years. When he is not doing academic research, Yan is active in the competitive hacking community. In the past, he was the captain of Shellphish, one of the oldest competitive hacking groups in the world. Recently, he founded the Order of the Overflow, whose iron fist controls DEF CON CTF, one of the "world championship" events in cybersecurity competitions. As part of this, Yan regularly deals with the question of how to get interested people into cybersecurity, and turn them into good hackers.
Predicting the outcome of future events is a fundamental problem in fields as varied as computer science, finance, political science and others. To this end, the Wisdom of Crowds principle says that an aggregate of many crowdsourced predictions can significantly outperform even expert individuals or models. In this talk, I will focus on the problem of accurately eliciting these predictions using wagering mechanisms, where participants provide both a probabilistic prediction of the event in question, and a monetary wager that they are prepared to stake.
I will discuss some recent progress on the design and analysis of wagering mechanisms. In particular, I will focus on two surprising applications of wagering mechanisms. First, they can be used to select a winner in a winner-takes-all forecasting competition, which turns out to be quite different to the standard setting where we can reward everyone. Second, I'll show that any wagering mechanism defines a corresponding allocation mechanism for dividing scarce resources among competing agents, a seemingly unrelated problem. This correspondence immediately leads to advances in both areas.
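As a rough illustration of what a wagering mechanism computes, the Python sketch below implements a weighted-score style payout rule of the kind studied in the literature; it is not necessarily one of the mechanisms covered in the talk. The quadratic scoring rule, the function names, and the example numbers are assumptions chosen for illustration. Each participant's payout is their wager scaled by how their score compares with the wager-weighted average score.

```python
def quadratic_score(p, outcome):
    """Score in [0, 1] for a probability p that the event occurs (outcome is 0 or 1)."""
    return 1.0 - (outcome - p) ** 2


def wagering_payouts(predictions, wagers, outcome):
    """Redistribute wagers using a weighted-score style rule (illustrative assumption).

    payout_i = wager_i * (1 + score_i - wager-weighted average score)
    """
    total = sum(wagers)
    scores = [quadratic_score(p, outcome) for p in predictions]
    weighted_avg = sum(m * s for m, s in zip(wagers, scores)) / total
    return [m * (1.0 + s - weighted_avg) for m, s in zip(wagers, scores)]


# Hypothetical example: three participants, and the event occurs (outcome = 1).
predictions = [0.9, 0.5, 0.2]
wagers = [10.0, 5.0, 5.0]
payouts = wagering_payouts(predictions, wagers, outcome=1)
print(payouts)
```

Under this rule the payouts always sum to the total amount wagered, so the mechanism needs no outside subsidy, and because scores lie in [0, 1] nobody can lose more than their own wager.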
Bio: Rupert Freeman is a postdoc at Microsoft Research New York City. Previously, he received his Ph.D. from Duke University under the supervision of Vincent Conitzer. His research focuses on the intersection of artificial intelligence and economics, particularly in topics such as resource allocation, voting, and information elicitation. He is the recipient of a Facebook Ph.D. Fellowship and a Duke Computer Science outstanding dissertation award.
Refreshments will be served at 3:30pm.
In standard auction theory, bidders' values are private, payments are linear, and the auctioneer seeks to maximize some objective, such as revenue (i.e., total payments). Myerson derived an optimal auction for precisely this setting. Perhaps surprisingly, Myerson's payments also arise as the unique Bayes-Nash equilibrium of an all-pay auction with private values and linear payments. Further, it has been observed that this all-pay auction model naturally describes a contest, where all contestants pay with their time and effort, even if they walk away empty-handed. Building on this analogy, we study an optimal contest design problem where contestants' abilities are private, their costs are a convex function of their effort (think: 80/20 rule), and the designer seeks to maximize submission quality. We show that a very simple all-pay contest where the prize is distributed equally among the top quartile of contestants is always a constant factor approximation to the optimal for a large class of convex cost functions. This result matches the intuition of teachers who do not award only a single A to the very best student in the class; doing so would not maximize the overall quality of student work, because many students would be discouraged from even trying.
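The prize rule in the simple contest described above can be sketched in a few lines of Python. This is only an illustration of splitting a prize equally among the top quartile: the function name, the rounding of the quartile size (rounded up, at least one winner), the tie handling (earlier submissions win ties), and the example scores are all assumptions not stated in the abstract. The all-pay aspect, where every contestant bears the cost of their effort regardless of outcome, is not modelled here.

```python
import math


def top_quartile_prizes(submission_quality, prize_pool):
    """Split prize_pool equally among the top quarter of submissions.

    Assumptions for illustration: the number of winners is ceil(n / 4),
    with at least one winner, and ties are broken in favour of the
    earlier index (Python's sort is stable).
    """
    n = len(submission_quality)
    if n == 0:
        return []
    winners = max(1, math.ceil(n / 4))
    ranked = sorted(range(n), key=lambda i: submission_quality[i], reverse=True)
    share = prize_pool / winners
    prizes = [0.0] * n
    for i in ranked[:winners]:
        prizes[i] = share
    return prizes


# Hypothetical quality scores for eight contestants and a prize pool of 100.
quality = [3.0, 9.5, 7.2, 1.1, 8.8, 4.4, 6.0, 2.5]
prizes = top_quartile_prizes(quality, prize_pool=100.0)
print(prizes)
```

With eight contestants the two highest-quality submissions each receive half of the pool, and everyone else receives nothing.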
Brief Bio: Dr. Amy Greenwald is Professor of Computer Science at Brown University in Providence, Rhode Island. She studies game-theoretic and economic interactions among computational agents, applied to areas like autonomous bidding in wireless spectrum auctions and ad exchanges. In 2011, she was named a Fulbright Scholar to the Netherlands (though she declined the award). She was awarded a Sloan Fellowship in 2006; she was nominated for the 2002 Presidential Early Career Award for Scientists and Engineers (PECASE); and she was named one of the Computing Research Association's Digital Government Fellows in 2001. Before joining the faculty at Brown, Dr. Greenwald was employed by IBM's T.J. Watson Research Center. Her paper entitled "Shopbots and Pricebots" (joint work with Jeff Kephart) was named Best Paper at IBM Research in 2000.
Refreshments at 8:45 am
Demand for resources that are collectively controlled or regulated by society, like social services or organs for transplantation, typically far outstrips supply. How should these scarce resources be allocated? Any approach to this question requires insights from computer science, economics, and beyond; we must define objectives, predict outcomes, and optimize allocations, while carefully considering agent preferences and incentives. In this talk, I will discuss our work on weighted matching and assignment in two domains, namely living donor kidney transplantation and provision of services to homeless households. My focus will be on how effective prediction of the outcomes of matches has the potential to dramatically improve social welfare both by allowing for richer mechanisms and by improving allocations. I will also discuss implications for equity and justice.
This talk is based on joint work with Zhuoshu Li, Amanda Kube, Sofia Carrillo, Patrick Fowler, and Jason Wellen.
Sanmay Das is an associate professor in the Computer Science and Engineering Department at Washington University in St. Louis. Prior to joining Washington University, he was on the faculty at Virginia Tech and at Rensselaer Polytechnic Institute. Das received Ph.D. and S.M. degrees from the Massachusetts Institute of Technology, and an A.B. from Harvard University, all in Computer Science. Das' research is in artificial intelligence and machine learning, and especially their intersection with finance, economics, and the social sciences. He has received an NSF CAREER Award and the Department Chair Award for Outstanding Teaching at Washington University. Das has served as program co-chair of AAMAS and AMMA, and regularly serves on the senior program committees of major AI conferences like AAAI and IJCAI.
Although each individual thinks that the decision to form a link in a social network is an autonomous, local decision, we know that this isn’t so. Larger forces are involved: human mental limitations (Dunbar’s Number), social balance that induces triangle closure and resists certain signed triads (Georg Simmel), and the flow of properties ‘over the horizon’ within networks (Christakis). Social network analysis has three phases: aggregating individual link decisions, understanding the global structure that results, and revisiting each link in the context of this global structure; and both the second and third phases provide insights.
The relationships associated with links are much richer than most social network analysis recognises: relationships can be of qualitatively different types (friend, colleague), directed (so the intensity one way is different from the intensity the other), signed (both positive and negative, friend or enemy), and dynamic (changing with time). All of these extra properties, and their combinations, can be modelled using a single scheme, creating an expanded social network, which can then be analysed in conventional ways (for example, by spectral embedding).
Social network analysis is especially useful in adversarial settings (where the interests of those doing the modelling and [some of] those being modelled are not aligned) because each individual cannot control much of the global structure. I will illustrate how this pays off in law enforcement and counterterrorism settings.
The internet and modern technology enable us to communicate and interact at lightning speed and across vast distances. The communities and markets created by this technology must make collective decisions, allocate scarce resources, and understand each other quickly, efficiently, and often in the presence of noisy communication channels, ever-changing environments, and/or adversarial data. Many theoretical results in these areas are grounded in worst-case assumptions about agent behavior or the availability of resources. Transitioning theoretical results into practice requires data-driven analysis and experiment as well as novel theory with assumptions based on real-world data. I'll discuss recent work that focuses on creating novel algorithms, including a strategyproof mechanism for selecting a small subset of winners amongst a group of peers and algorithms for resource allocation with applications ranging from reviewer matching to deceased organ allocation. These projects require novel algorithms and leverage data to perform detailed experiments as well as to create open source tools.
Nicholas Mattei is a Research Staff Member in the Cognitive Computing Group at the IBM TJ Watson Research Laboratory. His research is in artificial intelligence (AI) and its applications, largely motivated by problems that require a blend of techniques to develop systems and algorithms that support decision making for autonomous agents and/or humans. Most of his projects leverage theory, data, and experiment to create novel algorithms, mechanisms, and systems that enable and support individual and group decision making. He is the founder and maintainer of PrefLib: A Library for Preferences and the associated PrefLib:Tools available on GitHub, and is the founder/co-chair of the Exploring Beyond the Worst Case in Computational Social Choice workshop (2014 - 2017) held at AAMAS.
Nicholas was formerly a senior researcher working with Prof. Toby Walsh in the AI & Algorithmic Decision Theory Group at Data61 (formerly known as the Optimisation Group at NICTA). He was/is also an adjunct lecturer in the School of Computer Science and Engineering (CSE) and member of the Algorithms Group at the University of New South Wales. He previously worked as a programmer and embedded electronics designer for nano-satellites at NASA Ames Research Center. He received his Ph.D from the University of Kentucky under the supervision of Prof. Judy Goldsmith in 2012.
Refreshments served at 3:30 p.m.
Reception at 3:30pm
Emerging real-world graph problems include: detecting community structure in large social networks; improving the resilience of the electric power grid; and detecting and preventing disease in human populations. Unlike traditional applications in computational science and engineering, solving these problems at scale often raises new challenges because of the sparsity and lack of locality in the data, the need for additional research on scalable algorithms and development of frameworks for solving these problems on high performance computers, and the need for improved models that also capture the noise and bias inherent in the torrential data streams. In this talk, the speaker will discuss the opportunities and challenges in massive data-intensive computing for applications in computational science and engineering.
David A. Bader is Professor and Chair of the School of Computational Science and Engineering, College of Computing, at Georgia Institute of Technology. He is a Fellow of the IEEE and AAAS and served on the White House's National Strategic Computing Initiative (NSCI) panel. Dr. Bader served as a board member of the Computing Research Association, on the NSF Advisory Committee on Cyberinfrastructure, on the Council on Competitiveness High Performance Computing Advisory Committee, on the IEEE Computer Society Board of Governors, on the Steering Committees of the IPDPS and HiPC conferences, and as editor-in-chief of IEEE Transactions on Parallel and Distributed Systems, and is a National Science Foundation CAREER Award recipient. Dr. Bader is a leading expert in data sciences. His interests are at the intersection of high-performance computing and real-world applications, including cybersecurity, massive-scale analytics, and computational genomics, and he has co-authored over 210 articles in peer-reviewed journals and conferences. During his career, Dr. Bader has served as PI/coPI of over $180M of competitive awards. Dr. Bader has served as a lead scientist in several DARPA programs including High Productivity Computing Systems (HPCS) with IBM PERCS, Ubiquitous High Performance Computing (UHPC) with NVIDIA ECHELON, Anomaly Detection at Multiple Scales (ADAMS), Power Efficiency Revolution For Embedded Computing Technologies (PERFECT), and Hierarchical Identify Verify Exploit (HIVE). He has also served as Director of the Sony-Toshiba-IBM Center of Competence for the Cell Broadband Engine Processor. Bader is a co-founder of the Graph500 List for benchmarking "Big Data" computing platforms. Bader is recognized as a "RockStar" of High Performance Computing by InsideHPC and as HPCwire's People to Watch in 2012 and 2014. Dr. Bader also serves as an associate editor for several high impact publications including IEEE Transactions on Computers, ACM Transactions on Parallel Computing, and ACM Journal of Experimental Algorithmics. He successfully launched his school's Strategic Partnership Program in 2015, whose partners include Accenture, Booz Allen Hamilton, Cray, IBM, Keysight Technologies, LexisNexis, Northrop Grumman, NVIDIA, and Yahoo; as well as the National Security Agency, Sandia National Laboratories, Pacific Northwest National Laboratory, and Oak Ridge National Laboratory.
In the more than 200 years since Ada Lovelace's birth, she has been celebrated, neglected, and taken up as a symbol for any number of causes and ideas. This talk traces some of the paths that the idea of Lovelace, and her imagination of Charles Babbage's Analytical Engine, have taken. In particular, we focus on music and creativity, after Lovelace's idea that 'the engine might compose elaborate and scientific pieces of music of any degree of complexity or extent'.
This work began at a symposium in 2015 to mark the 200th anniversary of Lovelace's birth. Through collaborations with Pip Willcox (head of the Centre for Digital Scholarship at the Bodleian Libraries), composer Emily Howard (Royal Northern College of Music), and mathematician Ursula Martin, we have conducted a series of experiments and demonstrations inspired by the work of Lovelace and Babbage. These include simulations of the Analytical Engine, use of a web-based music application, construction of hardware, reproduction of earlier mathematical results using contemporary computational methods, and a musical performance based on crowdsourced algorithmic fragments.
Recently we have introduced Charles Wheatstone to our experiments: Wheatstone deployed an electric telegraph in the same period, but could an electromechanical computer have been constructed? These digital experiments bring insight and engagement with historical scenarios. Our designed digital artefacts can be viewed as design fictions, or as critical works explicating our interpretation of Lovelace’s words: digital prototyping as a means of close-reading. We frame this as Experimental Humanities.
David De Roure is a Professor of Computer Science in the School of Electronics and Computer Science at the University of Southampton, UK. A founding member of the School's Intelligence, Agents, Multimedia Group, he leads the e-Research activities and is Director of the Pervasive Systems Centre. David's work focuses on creating new research methods in and between multiple disciplines, particularly through the codesign of new tools. His projects draw on Web 2.0, Semantic Web, workflow and scripting technologies; he pioneered the Semantic Grid initiative and is an advocate of Science 2.0. He is closely involved in UK e-Science and e-Social Science programmes in projects including myExperiment, CombeChem, myGrid, LifeGuide, e-Research South and the Open Middleware Infrastructure Institute. David has worked for many years with distributed information systems and distributed programming languages, with leading roles in the Web, hypertext and Grid communities. He is a Scientific Advisory Council member of the Web Science Research Initiative and a Fellow of the British Computer Society.
Control network design consists mainly of two steps: input/output (I/O) selection and control configuration (CC) selection. The first addresses the problem of computing how many actuators/sensors are needed and where they should be placed in the plant to obtain some desired property. Control configuration relates to the decentralized control problem: it is the task of selecting which outputs (sensors) should be available for feedback, and to which inputs (actuators), in order to achieve a predefined goal. The choice of inputs and outputs affects the performance, complexity and cost of the control system. Due to the combinatorial nature of the selection problem, an efficient and systematic method is required to complement the designer's intuition, experience and physical insight.
Motivated by the above, this presentation addresses structural control system design, explicitly taking into consideration its possible application to large-scale systems. We provide an efficient framework to solve the following major minimization problems: i) selection of the minimum number of manipulated/measured variables to achieve structural controllability/observability of the system, and ii) selection of the minimum number of measured and manipulated variables, and of feedback interconnections between them, such that the system has no structural fixed modes. Contrary to what would be expected, we show that it is possible to obtain the global solution of the aforementioned minimization problems with complexity polynomial in the number of state variables of the system. To this effect, we propose a methodology that is efficient (polynomial complexity) and unified, in the sense that it solves the I/O and the CC selection problems simultaneously. This is done by exploiting the implications of the I/O selection for the solution of the CC problem.
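As a rough illustration of why such selection problems can be tractable, the sketch below computes a maximum matching in the bipartite representation of a state digraph, a polynomial-time building block that matching-based approaches to structural controllability commonly rely on. It is not the speaker's method: the function names are hypothetical, the augmenting-path routine is a textbook one, and the count of unmatched states is only one ingredient, since a full solution must also ensure that every state is reachable from some input.

```python
def max_matching(n, edges):
    """Maximum matching in the bipartite representation of a state digraph.

    n     -- number of state variables, labeled 0..n-1
    edges -- pairs (j, i) meaning state j directly influences state i
    Returns (matching size, list giving the matched predecessor of each state).
    """
    succ = [[] for _ in range(n)]
    for j, i in edges:
        succ[j].append(i)
    matched_pred = [None] * n  # matched_pred[i] = j if edge (j, i) is in the matching

    def try_augment(j, seen):
        # Standard augmenting-path search starting from the "source copy" of j.
        for i in succ[j]:
            if i in seen:
                continue
            seen.add(i)
            if matched_pred[i] is None or try_augment(matched_pred[i], seen):
                matched_pred[i] = j
                return True
        return False

    size = sum(try_augment(j, set()) for j in range(n))
    return size, matched_pred


def unmatched_states(n, edges):
    """States left without a matched predecessor (hypothetical helper)."""
    _, matched_pred = max_matching(n, edges)
    return [i for i in range(n) if matched_pred[i] is None]


# A directed chain 0 -> 1 -> 2 -> 3 leaves only its head unmatched,
# while a star 0 -> {1, 2, 3} leaves three states unmatched.
chain = [(0, 1), (1, 2), (2, 3)]
star = [(0, 1), (0, 2), (0, 3)]
print(unmatched_states(4, chain), len(unmatched_states(4, star)))
```

The chain and the star have the same number of states and edges, yet differ sharply in how many states a matching leaves uncovered, which is the kind of structural difference these selection problems exploit.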
This talk is based on the newly-released biography of Claude Shannon—one of the foremost intellects of the twentieth century and the architect of the Information Age, whose insights stand behind every computer built, email sent, video streamed, and webpage loaded. Claude Shannon was a groundbreaking polymath, a brilliant tinkerer, and a digital pioneer. He constructed the first wearable computer, outfoxed Vegas casinos, and built juggling robots. He also wrote the seminal text of the digital revolution, which has been called “the Magna Carta of the Information Age.”
Jimmy Soni has served as an editor at The New York Observer and the Washington Examiner and as managing editor of Huffington Post. He is a former speechwriter, and his written work and commentary have appeared in Slate, The Atlantic, and CNN, among other outlets. He is a graduate of Duke University. With Rob Goodman, he is the award-winning coauthor of Rome’s Last Citizen: The Life and Legacy of Cato, Mortal Enemy of Caesar, and A Mind at Play: How Claude Shannon Invented the Information Age. A Mind at Play was awarded the 2017 Neumann Prize from the British Society for the History of Mathematics for the book that best explains mathematical history for a general audience.
The phenomenon of spreading is very broad: it can mean the spreading of electromagnetic waves, wind-blown seeds, diseases or information. Spreading can also happen in various environments, from simple spatial ones (like waves on water) to complicated networks (like information in society). Quite often, it is evident that something has spread, but it is not directly known where it came from. For physical phenomena that spread through space at constant velocity, simple triangulation may be enough to pinpoint the source. But if the process happens in a complex network and is itself so complex that it can only be described stochastically (such as epidemics or information spreading), then finding the source becomes a more complicated problem.
Both epidemics and information spreading share a common property: no total mass, energy, count, or volume is conserved, as it is, for example, in spreading waves (energy) or diffusing substances (mass/volume). Because of that, both can be modeled in a similar way, for example by a basic SI (Susceptible-Infected) model. The presentation will describe some existing methods for finding the sources of spread in such processes, focusing on a method based on maximum likelihood introduced by Pinto et al., and will describe derivative methods that feature much better performance as well as improved accuracy.
Assume we have a network in which information or an epidemic has spread, and we only know when it arrived at some specific points in the network (which we call observers). Further assumptions are that spreading always happens along shortest paths from the source to any node, and that the delays on links can be approximated by a normal distribution. It is then possible, for each potential source in the network (every node in general), to calculate the expected arrival times as well as the variances and covariances of these times. For each node we therefore have a multivariate normal distribution of arrival times at all observers. Comparing these distributions with the actual observations, we can find the source whose distribution best fits the observed arrival times.
One drawback of this method is that, for a large number N of nodes in the network, the computational cost becomes prohibitive, with up to O(N^4) complexity. We have proposed a derivative method that limits the set of observers considered and chooses the suspected nodes more carefully. We take into account only the observers that are closest to the real source (those with the earliest observed infection times), greatly decreasing computational complexity. Instead of calculating the distribution for every node, we start with the closest observer and follow a gradient of increasing likelihood. These changes not only greatly improve performance (complexity at worst O(N^2 log N)) but also increase accuracy in scale-free network topologies.
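The basic maximum-likelihood idea (score every candidate source by how well its multivariate normal distribution of arrival times fits the observations) can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not the speaker's implementation: the spread is assumed to start at a known time zero, link delays are assumed independent and normal with hypothetical parameters `mu` and `sigma2`, breadth-first-search trees stand in for the shortest paths, and all function names are invented for the example.

```python
from collections import deque

import numpy as np


def bfs_tree(adj, source):
    """Parent pointers of a breadth-first (shortest-path) tree rooted at source."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def path_edges(parent, node):
    """Set of tree edges on the path from the root to node."""
    edges = set()
    while parent[node] is not None:
        edges.add((parent[node], node))
        node = parent[node]
    return edges


def log_likelihood(adj, candidate, observers, times, mu, sigma2):
    """Gaussian log-likelihood of the observed arrival times for one candidate."""
    parent = bfs_tree(adj, candidate)
    paths = [path_edges(parent, o) for o in observers]
    # Expected arrival time: mean link delay times path length.
    mean = np.array([mu * len(p) for p in paths])
    # Covariance: link variance times the number of links two paths share.
    cov = np.array([[sigma2 * len(p & q) for q in paths] for p in paths])
    diff = np.asarray(times, dtype=float) - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (quad + logdet + len(observers) * np.log(2 * np.pi))


def estimate_source(adj, observers, times, mu=1.0, sigma2=0.25):
    """Return the non-observer node with the highest log-likelihood."""
    candidates = [n for n in adj if n not in observers]
    return max(
        candidates,
        key=lambda c: log_likelihood(adj, c, observers, times, mu, sigma2),
    )


# Path graph 0-1-2-3-4 with observers at both ends.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(estimate_source(adj, observers=[0, 4], times=[1.0, 3.0]))
```

Scoring every node against every observer, as done here, is exactly the exhaustive search whose cost grows quickly with network size; restricting the observers and following the likelihood gradient, as the abstract describes, avoids evaluating most candidates.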
Krzysztof Suchecki is an assistant professor at Warsaw University of Technology, Faculty of Physics. He has MSc and PhD degrees in physics. His research topics focus on dynamics of and on complex networks, such as Ising and voter model dynamics, including co-evolution of network structure and dynamical node states. His current research is focused on spreading of information in networks and methods to identify sources of information.
enabled this expansion, and how a growing community of researchers has made it possible.
Julian Dolby is a Research Staff Member at the IBM Thomas J. Watson Research Center, where he works on a range of topics, including static program analysis, software testing, concurrent programming models and the Semantic Web. He is one of the primary authors of the publicly available Watson Libraries for Analysis (WALA) program analysis infrastructure, and his recent WALA work has focused on creating the WALA Mobile infrastructure.
Complex systems typically display a modular structure, as modules are easier to assemble than the individual units of the system, and more resilient to failures. In the network representation of complex systems, modules, or communities, appear as subgraphs whose nodes have an appreciably larger probability of being connected to each other than to other nodes of the network. In this talk I will address three fundamental questions: How is community structure generated? How can it be detected? How can the performance of community detection algorithms be tested? I will show that communities emerge naturally in growing network models favoring triadic closure, a mechanism that is necessary for generating large classes of systems, such as social networks. I will discuss the limits of the most popular class of clustering algorithms, those based on the optimization of a global quality function, such as modularity maximization. Testing algorithms is probably the single most important issue in network community detection, as it implicitly involves the concept of community, which is still controversial. I will discuss the importance of using realistic benchmark graphs with built-in community structure, as well as the role of metadata.
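To make the notion of a global quality function concrete, the following minimal sketch computes the standard Newman-Girvan modularity of a given partition of an undirected, unweighted graph. It is an illustration only: the function name and the toy graph are assumptions made for this example, and nothing here is taken from the talk itself.

```python
def modularity(edges, community):
    """Newman-Girvan modularity Q of a partition.

    edges     -- list of undirected edges (u, v), each listed once
    community -- dict mapping every node to its community label
    Q = sum over communities c of  L_c / m - (d_c / (2 m))^2,
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the total degree of the nodes in c.
    """
    m = len(edges)
    internal = {}  # L_c
    degree = {}    # d_c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items()
    )


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}
print(modularity(edges, two_groups), modularity(edges, one_group))
```

Splitting the graph into its two triangles scores higher than lumping all nodes together, which is the behavior that modularity-maximizing algorithms search for over all partitions.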
The next generation of complex engineered systems will be endowed with sensors and computing capabilities that enable new design concepts and new modes of decision-making. For example, new sensing capabilities on aircraft will be exploited to assimilate data on system state, make inferences about system health, and issue predictions on future vehicle behavior, with quantified uncertainties, to support critical operational decisions. However, data alone is not sufficient to support this kind of decision-making; our approaches must exploit the synergies of physics-based predictive modeling and dynamic data. This talk describes our recent work in adaptive and multifidelity methods for optimization under uncertainty of large-scale problems in engineering design. We combine traditional projection-based model reduction methods with machine learning methods, to create data-driven adaptive reduced models. We develop multifidelity formulations to exploit a rich set of information sources, using cheap approximate models as much as possible while maintaining the quality of higher-fidelity estimates and associated guarantees of convergence.
Karen E. Willcox is Professor of Aeronautics and Astronautics at the Massachusetts Institute of Technology. She is also Co-Director of the MIT Center for Computational Engineering and formerly the Associate Head of the MIT Department of Aeronautics and Astronautics. Before joining the faculty at MIT, she worked at Boeing Phantom Works with the Blended-Wing-Body aircraft design group. Willcox is currently Co-director of the Department of Energy DiaMonD Multifaceted Mathematics Capability Center on Mathematics at the Interfaces of Data, Models, and Decisions, and she leads an Air Force MURI on optimal design of multi-physics systems. She is also active in education innovation, serving as co-Chair of the MIT Online Education Policy Initiative, co-Chair of the 2013-2014 Institute‑wide Task Force on the Future of MIT Education, and lead of the Fly-by-Wire project developing blended learning technology as part of the Department of Education's First in the World program.
A central theme in mechanism design is understanding the tradeoff between simplicity and optimality of the designed mechanism. An important and challenging task here is to design simple multi-item mechanisms that can approximately optimize the revenue, as the revenue-optimal mechanisms are known to be extremely complex and thus hard to implement. Recently, we have witnessed several breakthroughs on this front, obtaining simple and approximately optimal mechanisms when the buyers have unit-demand (Chawla et al. '10) or additive (Yao '15) valuations. Although these two settings are relatively similar, the techniques employed in these results are completely different and seemed difficult to extend to more general settings. In this talk, I will present a principled approach to designing simple and approximately optimal mechanisms based on duality theory. Our approach unifies and improves both of the aforementioned results, and extends these simple mechanisms to broader settings, i.e., multiple buyers with XOS valuations.
Based on joint work with Nikhil R. Devanur, Matt Weinberg and Mingfei Zhao.
Yang Cai is a William Dawson Assistant Professor of Computer Science at McGill University. He received his PhD in computer science from MIT, advised by Costis Daskalakis. He was a postdoctoral researcher at UC Berkeley. Yang’s research interests lie in the area of theoretical computer science, in particular algorithmic game theory, online algorithms and logic.
Abstract: Although the concept of ransomware is not new (i.e., such attacks date back at least as far as the 1980s), this type of malware has recently experienced a resurgence in popularity. In fact, in 2014 and 2015, a number of high-profile ransomware attacks were reported, such as the large-scale attack against Sony that prompted the company to delay the release of the film, "The Interview". Ransomware typically operates by locking the desktop of the victim to render the system inaccessible to the user, or by encrypting, overwriting, or deleting the user's files. However, while many generic malware detection systems have been proposed, none of these systems have attempted to specifically address the ransomware detection problem. In this keynote, I talk about some of the trends we are seeing in ransomware. Then, I present a novel dynamic analysis system called UNVEIL that is specifically designed to detect ransomware. The key insight of the analysis is that in order to mount a successful attack, ransomware must tamper with a user's files or desktop. UNVEIL automatically generates an artificial user environment, and detects when ransomware interacts with user data. In parallel, the approach tracks changes to the system's desktop that indicate ransomware-like behavior. Our evaluation shows that UNVEIL significantly improves the state of the art, and is able to identify previously unknown evasive ransomware that was not detected by the anti-malware industry.
Bio: Engin Kirda holds the post of professor of computer science at Northeastern University in Boston. Before that, he held faculty positions at Institute Eurecom in the French Riviera and the Technical University of Vienna, where he co-founded the Secure Systems Lab that is now distributed over five institutions in Europe and the United States. Professor Kirda's research has focused on malware analysis (e.g., Anubis, Exposure, and Fire) and detection, web application security, and practical aspects of social networking security. He has co-authored more than 100 peer-reviewed scholarly publications and served on the program committees of numerous well-known international conferences and workshops. Professor Kirda was the program chair of the International Symposium on Recent Advances in Intrusion Detection (RAID) in 2009, the program chair of the European Workshop on Systems Security (Eurosec) in 2010 and 2011, the program chair of the well-known USENIX Workshop on Large Scale Exploits and Emergent Threats in 2012, and the program chair of the flagship security conference, the Network and Distributed System Security Symposium (NDSS), in 2015. In the past, Professor Kirda has consulted for the European Commission on emerging threats, and he recently gave a Congressional Briefing in Washington, D.C. on advanced malware attacks and cyber-security. He also spoke at SXSW Interactive 2015 about "Malware in the Wild" and at Blackhat 2015. Besides his roles at Northeastern, Professor Kirda is a co-founder of Lastline Inc., a Silicon Valley-based company that specializes in the detection and prevention of advanced targeted malware.
Abstract: Driven by modern applications and the abundance of empirical network data, network science aims at analysing real-world complex systems arising in the social, biological and physical sciences by abstracting them into networks (or graphs). The size and complexity of real networks have produced a deep change in the way that graphs are approached, and led to the development of new mathematical and computational tools. In this talk, I will present a data-driven network-based methodology to approach data analysis problems arising in a variety of contexts while highlighting state-of-the-art network models including spatial, multilayer, interdependent and modular networks. I will describe different stages in the analysis of data, starting from (1) network representations of urban transportation systems, then (2) inference of structural patterns in networks and phase transitions in their detectability, and finally (3) understanding the implications of structural features for dynamical processes taking place on networks. I will conclude with a discussion of the future directions of my work and its intersection with complexity theory, machine learning and statistics.
Bio: I received my Bachelor of Science degree, cum laude, in Mathematics and Computer Science (double degree) from the Israel Institute of Technology (Technion) in 2008. I then shifted my focus towards acquiring industrial experience and worked as a Software Engineer at Diligent Technologies, an Israeli startup that was acquired by IBM. I completed my PhD in Computer Science at the University of St Andrews, UK under the guidance of Simon Dobson in 2014. During my PhD I was supported by a full prize scholarship from the Scottish Informatics and Computer Science Alliance (SICSA). In 2014 I joined Peter Mucha's group in the Department of Mathematics at the University of North Carolina at Chapel Hill as a Postdoctoral Scholar. My research has been focused on the development of mathematical and computational tools for modeling and analyzing complex systems, roughly defined as large networks of simple components with no central control that give rise to emergent complex behavior. I believe that looking at data through a "network lens" provides us with useful perspectives on diverse problems, from designing optimal transportation systems and smart cities to clustering high-dimensional data. I am constantly looking for new datasets on which I can apply my "network machinery" to solve real-world problems and to inspire the development of new methodologies.
Abstract: Thanks to information technologies, online user behaviors are recorded broadly and at an unprecedented level. This gives us an opportunity to gain insights into human behavior and our societies from real data at a scale large enough to make manual analysis and inspection completely impractical when building intelligent and trustworthy systems. In this talk I will discuss the research problems, challenges, principles, and methodologies of developing network-based computational models for behavioral analysis. Specifically, I will present recent approaches to (1) modeling user behavior intentions with knowledge from social and behavioral sciences for behavior prediction, recommendation, and suspicious behavior detection, (2) modeling social and spatiotemporal information for knowledge from behavioral contexts, (3) structuring behavioral content into information networks of entities and attributes, and (4) integrating structured and unstructured behavior data to support decision-making and information systems. Two results, CatchSync and MetaPAD, will be presented in detail. I will conclude with my thoughts on future directions.
Bio: Dr. Meng Jiang is a postdoctoral researcher at the University of Illinois at Urbana-Champaign. He received his Ph.D. from the Department of Computer Science at Tsinghua University in 2015. He obtained his bachelor's degree from the same department in 2010. He visited Carnegie Mellon University in 2013 and the University of Maryland, College Park in 2016. Find more about him here: http://www.meng-jiang.com.
His research lies in the field of data mining, focusing on user behavior modeling. He has delivered two book chapters and two conference tutorials on this topic. His Ph.D. thesis won the Dissertation Award at Tsinghua. His work on "Suspicious Behavior Detection" was selected as one of the Best Paper Finalists in KDD '14. His work on "Social Contextual Recommendation" has been deployed in the Tencent social network. The package from his work on "Attribute Discovery from Text Corpora" is being transferred to the U.S. Army Research Lab. He has published 20 papers with 560+ citations, including 15 papers with 360+ citations as the first author.
Abstract: Complex systems exist in almost every aspect of science & technology. My research focuses on how to understand, predict, control, and ultimately survive real-world complex systems in an ever-changing world facing the global challenges of climate change, weather extremes, and other natural and human disasters. I will present three recent works in the field of network science and complex systems: resilience, robustness, and control. (I) Resilience, a system's ability to adjust its activity to retain its basic functionality when errors and environmental changes occur, is a defining property of many complex systems. I will show a set of analytical tools with which to identify the natural control and state parameters of a multi-dimensional complex system, helping us derive an effective one-dimensional dynamics that accurately predicts the system's resilience. The analytical results unveil the network characteristics that can enhance or diminish resilience, offering ways to prevent the collapse of ecological, biological, or economic systems, and guiding the design of technological systems resilient to both internal failures and environmental changes. (II) Increasing evidence shows that real-world systems interact with one another, and the real goal in network science should be not just to understand individual networks but to decipher the dynamical interactions in networks of networks (NONs). Malfunction of a few nodes in one network layer can cause cascading failures and catastrophic collapse of the entire system. I will show the general theoretical framework for analyzing the robustness of and cascading failures in NONs. The results for NONs have been surprisingly rich, and they differ so much from those of single networks that they present a new paradigm. (III) Controlling complex networks is the ultimate goal of understanding their dynamics. I will present a k-walk theory and greedy algorithm for target control of complex networks.
Building on these three aspects of my current research, I will describe two future research directions: universality of competitive dynamics, and control and recovery of damaged complex systems.
Bio: Dr. Jianxi Gao is a research assistant professor in the Center for Complex Network Research at Northeastern University. Dr. Gao completed his Ph.D. studies at Shanghai Jiao Tong University (2008 to 2012). During his Ph.D. he was a visiting scholar at Prof. H. Eugene Stanley's lab at Boston University from 2009 to 2012. Dr. Gao's major contributions include the theory for robustness of networks of networks and resilience of complex networks. Since 2010, Dr. Gao has published over 20 journal papers in Nature, Nature Physics, Nature Communications, Proceedings of the National Academy of Sciences, Physical Review Letters and more, with over 1,500 citations on Google Scholar. Dr. Gao has been selected as an editorial board member of Nature Scientific Reports, a distinguished referee of EPL (2014-2016), and a referee for Science, PNAS, PRL, PRX and more. His publications have been reported on over 20 times by international public and professional media.
Abstract: Can the people we follow on Twitter be recommended as our potential friends on Facebook? What box office can US movies achieve in China? How do weather and nearby points-of-interest (POIs) affect the traffic routes planned for vehicles? A large amount of information about the same entities can be collected from various sources, each of which provides a specific signature of the entity from a unique underlying aspect. Effective fusion of these different information sources provides an opportunity for understanding the information entities more comprehensively.
My thesis work investigates the principles, methodologies and algorithms for knowledge discovery across multiple aligned information sources, and evaluates the corresponding benefits. Fusing and mining multiple information sources of large Volumes and diverse Varieties is a fundamental problem in Big Data studies. In this talk, I will discuss information fusion and synergistic knowledge discovery, focusing on online social media, and present my algorithmic work on multi-source learning frameworks together with the evaluation results. I will also provide my future vision of fusion learning for broader real-world applications at the conclusion of the talk.
Bio: Jiawei Zhang has been a Ph.D. candidate in the Department of Computer Science at the University of Illinois at Chicago (UIC), under the supervision of Prof. Philip S. Yu, since August 2012. Prior to joining UIC, he obtained his Bachelor's Degree in Computer Science from Nanjing University, in China. His research interests span the fields of Data Science, Data Mining, Network Mining, and Machine Learning. His research focuses on fusing multiple large-scale information sources of diverse varieties together, and carrying out synergistic data mining tasks across these fused sources in one unified analytic.
His fusion learning work has appeared in KDD, ICDM, SDM, ECML/PKDD, IJCAI, WWW, WSDM, CIKM, and IEEE Transactions on Knowledge and Data Engineering (TKDE). He received the Best Student Paper Runner Up Award from ASONAM '16. He has been serving as the information specialist and director of ACM Transactions on Knowledge Discovery from Data (TKDD) since August 2014. He is also a PC member of WWW '17, KDD '16, CIKM '16, CIKM '15, and AIRS '16. Besides his academic experience at the University of Illinois at Chicago, he also has industrial research experience, having worked at Microsoft Research in 2014 and the IBM T.J. Watson Research Center in 2015.
The prediction and control of the dynamics of networked systems is one of the central problems in network science. Structural and dynamical similarities of different real networks suggest that some universal laws might accurately describe the dynamics of these networks, though the nature and common origins of such laws remain elusive. Do these universal laws exist? We do not have the answer to this question...yet. I will talk about the latent geometry approach to network systems, which, in my opinion, could be a first step toward the formulation of universal laws of network dynamics. In this approach, networks underlying complex systems are viewed as discretizations of smooth geometric spaces. Network nodes are points in these spaces, and the probability of a connection between two nodes depends on the distance between them. I will start my talk with a motivation and a high-level introduction to the latent geometry concept. I will continue with a (semi) rigorous discussion of the mathematics underlying the approach and computational algorithms for uncovering latent geometries of real systems. I will conclude my talk by describing existing and prospective applications of latent geometry, including Internet interdomain routing, large-scale dynamics of networked systems, human diseases and social dynamics.
Bio: Dr. Kitsak is an associate research scientist in the Department of Physics and the Network Science Institute at Northeastern University. Dr. Kitsak earned his Ph.D. in theoretical physics from Boston University in 2009 under the direction of Prof. H.E. Stanley. Dr. Kitsak has held postdoctoral positions at the Center for Applied Internet Data Analysis (CAIDA), UC San Diego (2009-2012); and the Center for Complex Network Research (CCNR), Northeastern University (2012-2014). His research focuses on the development of theoretical and computational approaches to networked systems.
Abstract: Probabilistic reasoning lets you predict the future, infer past causes of current observations, and learn from experience. It can be hard to implement a probabilistic application because you have to implement the representation, inference, and learning algorithms. Probabilistic programming makes this much easier by providing an expressive language to represent models as well as inference and learning algorithms that automatically apply to models written in the language. In this talk, I will present the past, present, and future of probabilistic programming and our Figaro probabilistic programming system. I will start with the motivation for probabilistic programming and Figaro. After presenting some basic Figaro concepts, I will introduce several applications we have been developing at Charles River Analytics using Figaro. Finally, I will describe our future vision of providing a probabilistic programming tool that domain experts with no machine learning knowledge can use. In particular, I will present a new inference method that is designed to work well on a wide variety of problems with no user configuration. Prior knowledge of machine learning is not required to follow this talk.
Bio: Dr. Avi Pfeffer is Chief Scientist at Charles River Analytics. Dr. Pfeffer is a leading researcher on a variety of computational intelligence techniques including probabilistic reasoning, machine learning, and computational game theory. Dr. Pfeffer has developed numerous innovative probabilistic representation and reasoning frameworks, such as probabilistic programming, which enables the development of probabilistic models using the full power of programming languages, and statistical relational reasoning. He is the lead developer of Charles River Analytics' Figaro probabilistic programming language. As an Associate Professor at Harvard, he developed IBAL, the first general-purpose probabilistic programming language. While at Harvard, he also produced systems for representing, reasoning about, and learning beliefs, preferences, and decision making strategies of people in strategic situations. Prior to joining Harvard, he invented object-oriented Bayesian networks and probabilistic relational models, which form the foundation of the field of statistical relational learning. Dr. Pfeffer serves as Program Chair of the Conference on Uncertainty in Artificial Intelligence. He has published many journal and conference articles and is the author of a text on probabilistic programming. Dr. Pfeffer received his Ph.D. in computer science from Stanford University and his B.A. in computer science from the University of California, Berkeley.
With great progress made in the areas of Cognitive Computing and Human-scale Sensory Research, the paradigm of human-computer interaction will be a collaboration between human beings and intelligent machines through a human-scale and immersive experience. What research challenges exist today in enabling this new paradigm, how it will help human beings (especially groups of people), and what transformation it will bring are all important questions. In this talk, Dr. Su will introduce the vision of the future Cognitive and Immersive Situations Room - an immersive, interactive, intelligent physical environment - and the research agenda to realize that vision.
Bio: Dr. Hui Su is the Director of the Cognitive and Immersive Systems Lab, a collaboration between IBM Research and Rensselaer Polytechnic Institute. He has been a technical leader and an executive at IBM Research. Most recently, he was the Director of the IBM Cambridge Research Lab in Cambridge, MA and was responsible for a broad scope of global missions in IBM Research, including Cognitive User Experience, the Center for Innovation in Visual Analytics, and the Center for Social Business. As a technical leader and a researcher for 20 years at IBM Research, Dr. Su has been an expert in multiple areas, including Human Computer Interaction, Cloud Computing, Visual Analytics, and Neural Network Algorithms for Image Recognition. As an executive, he has led research labs and research teams in the U.S. and China.
Dimensionality reduction methods have been used to represent words with vectors in NLP applications since at least the introduction of latent semantic indexing in the late 1980s, but word embeddings developed in the past several years have exhibited a robust ability to map semantics in a surprisingly straightforward manner onto simple linear algebraic operations. These embeddings are trained on co-occurrence statistics and intuitively justified by appealing to the distributional hypothesis of Harris and Firth, but are typically presented in an ad hoc algorithmic manner. We consider the canonical skip-gram Word2vec embedding, arguably the best known of these recent word embeddings, and establish a corresponding generative model that maps the composition of words onto the addition of their embeddings. Our results are meaningful in a broad context for two reasons: first, words can be replaced more generally with arbitrary symbols; second, there are natural connections between Word2vec, the classical RC(M) association model for contingency tables, Bayesian PCA decompositions, Poisson matrix completion, and the Sufficient Dimensionality Reduction model of Globerson and Tishby.
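To make "the composition of words onto the addition of their embeddings" concrete, the following is a minimal Python sketch using hand-made toy vectors. It is not the generative model from the talk, and the vectors are not trained Word2vec output: the `EMBEDDINGS` table, its values, and the helper names are all hypothetical and exist only to show the mechanics of adding two word vectors and finding the nearest remaining word by cosine similarity.

```python
import math

# Hypothetical 3-dimensional toy vectors, invented for illustration only.
EMBEDDINGS = {
    "ice":    (1.0, 0.0, 0.0),
    "cream":  (0.0, 1.0, 0.0),
    "gelato": (0.9, 1.1, 0.0),
    "engine": (0.0, 0.0, 1.0),
}

def add(u, v):
    """Compose two words by adding their vectors component-wise."""
    return tuple(a + b for a, b in zip(u, v))

def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(sum(a * a for a in u))

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (norm(u) * norm(v))

def nearest(query, exclude=()):
    """Return the vocabulary word whose vector is most similar to the query."""
    candidates = (w for w in EMBEDDINGS if w not in exclude)
    return max(candidates, key=lambda w: cosine(query, EMBEDDINGS[w]))

phrase = add(EMBEDDINGS["ice"], EMBEDDINGS["cream"])
print(nearest(phrase, exclude=("ice", "cream")))  # gelato
```

With real embeddings the same two steps apply, but the vectors would come from a model trained on co-occurrence statistics rather than being written by hand.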