ENS-Data Science colloquium – Noah A. Smith (University of Washington)

Salle Jaurès 29 rue d’Ulm

Noah A. Smith (University of Washington) Breaking Down Language Models “Language models are the only thing we have in natural language processing that could be considered scientific.” A collaborator of mine said this more than a decade ago, long before LMs emerged as the single most important technology to come out of our field. In these exciting times, I seek both to make the study of LMs more scientific, and to make LMs more practically beneficial. In this talk, I’ll first draw from recent work from my UW group that starts […]

ENS-Data Science colloquium – Antoine Georges

Amphi Jaurès (29 Rue d'Ulm)

21 March 2024, Antoine Georges (Collège de France, Paris and Flatiron Institute, New York) Title: Applications of Machine Learning and Neural Networks to Quantum Systems Abstract: Applications of learning algorithms using deep neural networks have developed considerably recently, often with spectacular results. The physics of complex quantum systems is no exception, with multiple applications that constitute a new field of research. Examples include the representation and optimization of wave functions of quantum systems with large numbers of degrees of freedom (neural quantum states), the determination of wave functions from measurements (quantum tomography), and applications […]

ENS-Data Science colloquium – Lénaïc Chizat (EPFL)

Amphi Jaurès (29 Rue d'Ulm)

4 April 2024, Lénaïc Chizat (EPFL) Title: A Formula for Feature Learning in Large Neural Networks Abstract: Deep learning succeeds by doing hierarchical feature learning, but tuning hyperparameters such as initialization scales, learning rates, etc., only gives indirect control over this behavior. This calls for theoretical tools to predict, measure and control feature learning. In this talk, we will first review various theoretical advances (signal propagation, infinite width dynamics, etc.) that have led to a better understanding of the subtle impact of hyperparameters and architectural choices on the training dynamics. We will then introduce […]

ENS-Data Science colloquium – Michael Jordan

Amphi Jaurès (29 Rue d'Ulm)

Michael Jordan (UC Berkeley and INRIA Paris) Collaborative Learning, Information Asymmetries, and Incentives This colloquium is organized around data sciences in a broad sense, with the goal of bringing together researchers with diverse backgrounds (including mathematics, computer science, physics, chemistry and neuroscience) but a common interest in dealing with complex, large-scale, or high-dimensional data. More information can be found on the web page of the seminar: https://data-ens.github.io/seminar/

ENS-Data Science colloquium – Luca Biferale

Salle conf IV

Luca Biferale (Università degli Studi di Roma Tor Vergata) Title: Data driven tools for Lagrangian Turbulence Abstract: We present a stochastic method for generating and reconstructing complex signals along the trajectories of small objects passively advected by turbulent flows. Our approach makes use of generative Diffusion Models, a recently proposed data-driven machine learning technique. We show applications to 3D tracers and inertial particles in highly turbulent flows, 2D trajectories from NOAA’s Global Drifter Program and dynamics of charged particles in astrophysics. Supremacy against linear decomposition and Gaussian Regression Processes is analyzed in terms […]

ENS-Data Science colloquium – Jean-Rémi King

ENS Salle Dussane

Jean-Rémi King (CNRS, ENS & Meta AI) Title: AI and Neuroscience: in search of the laws of intelligence Abstract: In just a few years, AI has transitioned from a specialized field into a transformative force for industries and society. Beyond this technical progress, the development of AI provides a new paradigm to understand the intricate workings of the human brain. To illustrate this, we will delve into a series of experiments that systematically compare deep learning algorithms with the human brain in response to images, sounds, and texts. These comparisons consistently show a partial […]

ENS-Data Science colloquium – Michele Ceriotti (EPFL)

ENS Salle Dussane

Michele Ceriotti (EPFL) Title: Between physics and scaling: inductive biases in atomistic machine learning Abstract: Machine-learning techniques are often applied to perform "end-to-end" predictions, making black-box estimates of a property of interest using only a coarse description of the corresponding inputs. In contrast, atomic-scale modeling of matter is most useful when it allows one to gather a mechanistic insight into the microscopic processes that underlie the behavior of molecules and materials. In this talk I will provide an overview of the progress that has been made combining these two philosophies, using data-driven techniques to build surrogate models […]

Eva Dyer (University of Pennsylvania): large-scale pretraining on neural data allows for transfer across individuals, tasks and species

Amphi Jaurès (29 Rue d'Ulm)

The brain is incredibly complex, with diverse functions that emerge from the coordinated activity of billions of neurons. These functions vary across brain regions and adapt dynamically as we engage in different tasks, process sensory information, or generate behavior. Yet, each neural recording captures only a small glimpse of this immense complexity, offering a limited view of the broader system. This motivates the need for an algorithmic approach to stitch together diverse datasets, integrating neural activity across brain regions, cell types, and individuals. In this talk, I will present our […]

Étienne Ollion – Machine Bias. How do generative LLMs Answer Opinion Polls?

ENS Salle Dussane

Generative AI is increasingly presented as a potential substitute for humans, including as human research subjects in various disciplines. Yet there is no scientific consensus on how closely these in-silico clones could represent their human counterparts. While some defend the use of these “synthetic users,” others point towards the biases in the responses provided by the LLMs. Through an experiment using survey questionnaires, we demonstrate that these latter critics are right to be wary of using generative AI to emulate respondents, but probably not for the right reason. Our results […]

Nathan Srebro: Learning to Answer from Correct Demonstrations

Amphi Jean Jaurès 45 rue d'Ulm, PARIS, France


Alessandro Laio: Identifying informative distance measures in high-dimensional feature spaces

ENS Salle Dussane

Real-world data typically contain a large number of features that are often heterogeneous in nature, relevance, and also units of measure. When assessing the similarity between data points, one can build various distance measures using subsets of these features. Finding a small set of features that still retains sufficient information about the dataset is important for the successful application of many statistical learning approaches. We introduce an approach that can assess the relative information retained when using two different distance measures, and determine if they are equivalent, independent, or if […]

Thibaut Germain: A Spectral-Grassmann Wasserstein metric for operator representations of dynamical systems

Salle W

The geometry of dynamical systems estimated from trajectory data is a major challenge for machine learning applications. Koopman and transfer operators provide a linear representation of nonlinear dynamics through their spectral decomposition, offering a natural framework for comparison. We propose a novel approach representing each system as a distribution of its joint operator eigenvalues and spectral projectors and defining a metric between systems leveraging optimal transport. The proposed metric is invariant to the sampling frequency of trajectories. It is also computationally efficient, supported by finite-sample convergence guarantees, and enables the […]