Chandra Data Science:
Novel Methods in Computing and Statistics for X-ray Astronomy
Presentation Abstracts
- Procedures from mid-20th century statistics, such as nonparametric CUSUM tests and Wald's likelihood-based Sequential Probability Ratio Test, can be effective.
- The problem can be viewed as multiple changepoint detection in a Poisson time series; this is a challenging modeling problem. Scargle's Bayesian Blocks (widely used in high energy astronomy) for unbinned arrival times and Chib's Bayesian latent variable method for binned arrival times are compared.
- The problem can be viewed as a time series of non-homogeneous Poisson images. Xu et al. recently presented a changepoint detection procedure based on minimum description length model selection.
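As a rough illustration of the sequential approach in the first bullet, the sketch below runs a one-sided CUSUM on binned counts. It is a likelihood-ratio (Page-style) variant rather than a nonparametric one, and the quiescent rate, flare rate, threshold, and example counts are all assumptions chosen for illustration.

```python
import math

def poisson_cusum(counts, rate0, rate1, threshold):
    """Return the index of the first bin where the CUSUM statistic exceeds
    the threshold, or None if no alarm is raised.

    rate0 is the assumed quiescent mean per bin, rate1 the assumed flare mean.
    """
    log_ratio = math.log(rate1 / rate0)
    drift = rate1 - rate0
    stat = 0.0
    for i, n in enumerate(counts):
        # Add the Poisson log-likelihood ratio of this bin, resetting at zero.
        stat = max(0.0, stat + n * log_ratio - drift)
        if stat > threshold:
            return i
    return None

# Hypothetical light curve: eight quiescent bins followed by a flare.
quiet = [1, 0, 2, 1, 1, 0, 1, 2]
flare = [5, 6, 4, 7]
alarm = poisson_cusum(quiet + flare, rate0=1.0, rate1=5.0, threshold=5.0)
print(alarm)
```

The threshold trades detection delay against false alarms; in practice it would be calibrated by simulation for the background level at hand.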
- Allowing for elliptical position errors
- Matching sources detected with significantly different PSFs and allowing for source extent
- Explicit identification of ambiguous matches
- Consistent thresholds across different areas
- Viable simultaneous matching of more than two catalogs
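One way to picture matching with elliptical position errors is a covariance-weighted (Mahalanobis) separation. The sketch below assumes small offsets in a local tangent plane, Gaussian errors, and an angle measured from the x axis; the function names and the 95% threshold are illustrative choices, not taken from any particular catalog pipeline.

```python
import math

def ellipse_cov(semi_major, semi_minor, angle_deg):
    """Covariance terms (xx, xy, yy) of a 1-sigma error ellipse.
    The angle is measured from the x axis (an assumed convention)."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    a2, b2 = semi_major ** 2, semi_minor ** 2
    return (a2 * c * c + b2 * s * s, (a2 - b2) * c * s, a2 * s * s + b2 * c * c)

def match_distance_sq(dx, dy, cov_a, cov_b):
    """Squared separation in units of the combined positional error."""
    xx = cov_a[0] + cov_b[0]
    xy = cov_a[1] + cov_b[1]
    yy = cov_a[2] + cov_b[2]
    det = xx * yy - xy * xy
    return (yy * dx * dx - 2.0 * xy * dx * dy + xx * dy * dy) / det

# Hypothetical sources: one elongated along x, one with a circular error.
cov_a = ellipse_cov(2.0, 0.5, 0.0)
cov_b = ellipse_cov(0.5, 0.5, 0.0)
threshold = -2.0 * math.log(1.0 - 0.95)  # chi-square quantile for 2 degrees of freedom

d_along = match_distance_sq(2.0, 0.0, cov_a, cov_b)   # offset along the major axis
d_across = match_distance_sq(0.0, 2.0, cov_a, cov_b)  # same offset across it
print(d_along < threshold, d_across < threshold)
```

The same 2-arcsecond offset is accepted along the major axis and rejected across it, which a single circular match radius cannot reproduce.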
Welcome and Guidelines
Moderate count Chandra X-ray spectra can contain a surprising amount of information, such as the cosmological redshift and the geometry of the obscurer in active galactic nuclei. However, this information is only accessible when fully embracing the Poisson regime, carefully modelling background and source spectral components, combined with robust, modern Bayesian algorithms that explore the model parameter space. With exemplary science highlights I will demonstrate techniques for constraining physical parameters, obtaining underlying sample distributions and comparing competing physical models. These techniques enable the systematic analysis of tens of thousands of eROSITA survey sources.
The goal is to improve the sensitivity of Chandra HRC by expanding on Grant Tremblay's success with the hyperscreen algorithm, which improves on the original background rejection algorithm from Murray+ 2000. Here, we explore semi-supervised techniques; these methods utilize multiple axes of information from the event files to assign probabilities that individual events are real (science) or fake (background).
The spectra of accreting systems such as active galactic nuclei and ultraluminous X-ray sources commonly show spectral lines of plasma ionised by the extreme radiation field. Its properties are unknown a priori, and often the plasma can be in motion with respect to the observer (e.g. an outflowing disc wind), potentially leading to significant Doppler shifts, complicating the line identification. Hence the plausible interpretation of some observed spectral features with a plasma model may span a very large parameter space of ionisations, column densities and systematic/turbulent velocities. Searching through such a large parameter space increases the chances that certain parameter combinations will model Poisson noise in the observed spectrum. This leads to spurious plasma/line detections which appear significant but correspond to no physical emission or absorption. Therefore, to correctly determine the detection significance of any unexpected spectral features, a systematic approach must be applied to tackle this issue, known as the look-elsewhere effect. I will describe such a plasma search technique and its recent successful applications to the detection of fast winds in AGN and ULXs.
Training datasets for machine learning rely on catalogs of confidently classified sources. Ensuring these catalogs remain up to date is a tedious and time consuming task. We will discuss our efforts towards building X-ray source catalogs that can be vetted and kept up to date by the community as new sources are discovered, ensuring high quality catalogs for population studies/training datasets.
Radio-loud quasars (RLQs) are more X-ray luminous than predicted by the X-ray–optical/UV relation for radio-quiet quasars (RQQs). The excess X-ray emission depends on the properties of radio jets. We perform large-scale archival Chandra/XMM-Newton data mining to investigate the X-ray–optical/UV–radio relation of optically selected RLQs. Model selection using information criteria supports the scenario where the disk/corona instead of the jets dominates the X-ray emission, which challenges 35 years of thinking about the basic nature of the nuclear X-ray emission of RLQs. A distinct jet component is likely important for only a small portion of flat-spectrum radio quasars. The corona–jet, disc–corona, and disc–jet connections of RLQs are likely driven by independent physical processes. Furthermore, the corona–jet connection implies that small-scale processes in the vicinity of supermassive black holes, probably associated with the magnetic flux/topology instead of black hole spin, are controlling the radio-loudness of quasars.
In recent years, advances in the field of large-scale, high-cadence, time-domain surveys have provided unprecedentedly rich and diverse data sets, and led to the discovery of new types of transients. Understanding the origin and nature of transients is among the key science cases for the deployment of future missions surveying the sky across the electromagnetic spectrum. Transient phenomena that emit in the X-ray are extremely diverse and vary over a broad range of timescales. I will give a review of several methods currently used for the detection of transient X-ray sources. I will then focus on the challenges of applying traditional matched filtering techniques in the low-number count Poisson noise regime and will introduce a solution to this problem derived by Ofek & Zackay, which was demonstrated to be optimal. I will then share preliminary results from the first implementation of this technique on the Chandra data and present a new catalog of X-ray sources detected in twenty years of Chandra archival data.
We present early results of a Chandra Cycle 21 LP observation of the galaxy cluster RBS 797, whose previous X-ray studies revealed two pronounced X-ray cavities in the east-west (E-W) direction. Follow-up VLA radio observations of the central active galactic nucleus (AGN) uncovered different jet orientations, with radio lobes filling the E-W cavities and perpendicular jets showing emission in the north-south (N-S) direction extending up to the same scale (≈30 kpc). With the new ~427 ks total exposure we report the detection of two additional X-ray cavities in the N-S direction at nearly the same radial distance as the E-W ones. The newly discovered N-S cavities are associated with the radio emission detected in archival VLA data, making RBS 797 the first galaxy cluster found to have four equidistant, centrally-symmetric, radio-filled cavities. We find that the two outbursts are approximately coeval, with an age difference of ≈1 Myr (X-ray data) and <1-10 Myr (radio data). To explain these results we consider either the presence of a binary AGN which is excavating coeval pairs of cavities, or a fast (≪1 Myr) jet reorientation event which produced subsequent, misaligned outbursts.
In this presentation, we will discuss our previous work using random forests to estimate the number of underlying thermal components in ICM spectra. We will also discuss our ongoing work to use machine learning to extract thermodynamic parameters (i.e. temperature and metallicity) directly from Chandra spectra.
Observations of Hercules X-1 with Swift/BAT, MAXI and AstroSat's LAXPC, SXT and UVIT instruments are analyzed. Better coverage of the 35-day cycle than previous observations allows detection of new features in the 35-day multi-band light curves.
I will show a new method of probing the plane-of-the-sky orientation of magnetic fields in clusters using the intensity gradients of X-ray maps. We utilize Chandra X-ray images of the Perseus, M87, Coma, and A2597 galaxy clusters. We find that the fields predominantly follow the sloshing arms in Perseus, which is in agreement with numerical simulations.
An X-ray study of the deeply embedded Wolf-Rayet star WR 121a has been carried out using long-term (~12 yr) archival observations from Chandra and XMM-Newton. For the first time, a hint of variable X-ray emission was found in WR 121a from long-term X-ray data, where the X-ray flux differs by a factor of ~2 from minimum to maximum in the 0.3-10.0 keV energy band. The X-ray spectrum of WR 121a is well explained by a thermal plasma emission model with temperatures of 1.0±0.3 keV and 3.6±0.7 keV for the cool and hot components, respectively, and non-solar abundances. An X-ray luminosity of ~10^34 erg/s associated with WR 121a makes it one of the brightest massive binaries. We suggest that the interaction of the stellar winds of both binary components is responsible for the origin of hard X-ray emission from this system. Further, the collision of the primary and secondary stars' winds has been found to be radiative. The most probable reason behind the X-ray flux variations, as well as the different dynamical processes involved in the wind collision of WR 121a, will be covered by this contribution.
Detection of templates (e.g., sources) embedded in low-number count Poisson noise is a common problem in astrophysics, including source detection in X-ray images. The solutions in the X-ray-related literature are sub-optimal in some cases. We derive the optimal statistics for template detection in the presence of Poisson noise. We demonstrate that, for known template shape, this method provides higher completeness, for a fixed false-alarm probability value, compared with filtering the image with the point-spread function (PSF), or the Mexican-hat wavelet. For some background levels, our method improves the sensitivity of source detection by more than a factor of two over the Mexican-hat wavelet filtering. This filtering technique can also be used for fast PSF photometry and flare detection; it is efficient and straightforward to implement.
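To make the idea concrete: for a source of flux F with PSF P on a flat background B, the Poisson log-likelihood ratio between "source present" and "background only" reduces, up to a constant, to the counts cross-correlated with ln(1 + F·P/B). The one-dimensional sketch below applies that kernel; the PSF values, flux, background, and counts are assumptions for illustration, and the constant term and detection threshold are omitted.

```python
import math

def poisson_filter(psf, flux, background):
    """Log-likelihood-ratio kernel ln(1 + flux * psf / background)."""
    return [math.log1p(flux * p / background) for p in psf]

def score_map(counts, kernel):
    """Cross-correlate the counts with the kernel, zero-padding the edges."""
    half = len(kernel) // 2
    scores = []
    for i in range(len(counts)):
        total = 0.0
        for k, weight in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(counts):
                total += counts[j] * weight
        scores.append(total)
    return scores

# Hypothetical normalized PSF and a low-background counts strip.
psf = [0.1, 0.2, 0.4, 0.2, 0.1]
kernel = poisson_filter(psf, flux=10.0, background=0.05)
counts = [0, 0, 0, 1, 0, 0, 1, 3, 5, 2, 0, 0, 0, 0, 1, 0]
scores = score_map(counts, kernel)
peak = max(range(len(scores)), key=scores.__getitem__)
print(peak, [round(w, 2) for w in kernel])
```

At low background the kernel is much flatter than the PSF itself, so photons in the PSF wings carry nearly as much weight as those in the core; as the background grows, ln(1 + F·P/B) approaches F·P/B and the familiar PSF matched filter is recovered.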
We present a systematic cross-correlation of 5 different X-ray source catalogs, including the Chandra Source Catalog 2, to search for X-ray long-term variability. Two major interests of such a method are the wealth of different variable sources that can still be uncovered in the archival data, and the future use of this method in the XMM-Newton pipeline to alert the community in quasi real time.
Many astrophysical phenomena are time-varying, in the sense that their intensity, energy spectrum, and/or the spatial distribution of the emission suddenly change. We developed a method for modeling a time series of images. The proposed method is designed to estimate the number and the locations of all the change points (in time domain), as well as all the unknown piecewise constant functions between any pairs of the change points. The method applies the minimum description length (MDL) principle to perform this task. A practical algorithm is also developed to solve the corresponding complicated optimization problem. Simulation experiments and applications to real datasets show that the proposed method enjoys very promising empirical properties. Applications to two real datasets, the XMM observation of a flaring star and an emerging solar coronal loop, illustrate the usage of the proposed method and the scientific insight gained from it.
Cosmology is entering an era of data-driven science, due in part to advancements in modern machine learning techniques that enable powerful new data analysis methods. This is a shift in our scientific approach, and requires us to ask an important question: Can we trust the black box? I will present a deep machine learning (ML) approach to constraining cosmological parameters from X-ray surveys of galaxy clusters. The ML approach has two components: an autoencoder that builds a compressed representation of each galaxy cluster and a flexible convolutional neural network to estimate the cosmological model from a cluster sample. From mock observations, the ML method estimates the amplitude of matter fluctuations, sigma8, at approximately the expected theoretical limit. More importantly, the deep ML approach can be understood and interpreted, and model interpretation led to the discovery of a previously unknown self-calibration mode for flux- and volume-limited cluster surveys. I will describe this new mode, which uses the amplitude and peak of the cluster mass PDF as an anchor for cluster mass calibration.
In this work we present the results of one of the deepest X-ray surveys (1.75 Ms) carried out on the same field by XMM-Newton. We detected 301 point sources that were cross-correlated with an optical source list extracted from our GTC follow-up, and complemented by WISE/2MASS IR data. We identified 244 optical/IR counterparts that enable us to determine that 204 sources are AGNs.
Despite the progress in modelling non-uniformly sampled light curves, strict uniformity still holds an edge in various analyses. I will discuss a method that can construct uniformly sampled light curves with or without the presence of data at multiple X-ray bands. The method can take the PSF or the response function as a weight when calibrating the light curves to any energy band.
The ALeRCE broker is processing the alert stream from the Zwicky Transient Facility (ZTF) and is an official Community Broker for the Legacy Survey of Space and Time (LSST). We use Cloud Infrastructure and Machine Learning to bring real-time processed products and services to the astronomical community. Thus combined with Chandra, ALeRCE can be used for multimission and multiwavelength studies.
I present the results of our latest systematic study of short GRBs at late times in X-rays using the available Swift, XMM-Newton and Chandra data of all events detected to date (February 2021). I also introduce updated constraints on the jet opening angles of such events utilizing all the broad-band afterglow information encompassing the reanalyzed X-ray data and the available radio, optical and near-infrared information in the literature. I present the most updated opening angle distribution of short GRBs, and discuss the implications for GRB energy scales and progenitor (merger) rates.
We analyse Chandra data of HD179949 to study temporal variability of the intensity and spectrum. We compute hardness ratio light curves using a Bayesian method to show spectral variability is ubiquitous. We demonstrate methods to establish the presence of spectro-temporal variability. For analysing variability, we used the spectral fitting models with the least variance in the Cash-statistic. The author acknowledged the support of IISER Mohali and guidance of their MS thesis supervisors: Prof. Kulinder Pal Singh and Dr. Vinay Kashyap.
In this work, we present the detection of near-infrared (NIR) emission from low-luminosity AGNs (LLAGNs). We examine the dependence of NIR and (2-10)keV X-ray luminosity as a function of the black hole accretion rate, and find an excess of NIR emission at the lowest Eddington ratios.
X-ray source catalogs from the Chandra and XMM-Newton observatories, and most recently from the eROSITA all-sky survey, contain large numbers of sources. A uniform and statistically sound analysis of their content requires automated methods to classify and characterize the sources. Such attempts typically make use of carefully designed match catalogs, and I will show that the number of true matches can be measured very accurately for any match catalog just from the ensemble of match distances, even without considering any physical properties. I will then continue with machine learning and Bayesian techniques that deal with the stellar content of X-ray catalogs, and show how both methods provide accurately calibrated match probabilities. Specifically, a classifier using a support vector machine is able to learn the target properties from small samples and performs about equally well as Bayesian techniques. Both methods reach about 90% completeness and reliability in the eROSITA performance verification fields. Finally, I will discuss a few astrophysical questions that greatly benefit from such large, well calibrated samples, such as the local star formation history.
Accurate measurement of X-ray energies with CCDs relies on many factors, including pixel size, detector thickness, readout noise, and our understanding of the physical processes governing photon collection. Past and current missions, including Chandra, have used an empirical approach to model detector response, including a morphological metric or “grade” to discriminate particle events from celestial X-rays and to characterize the energy of those X-rays. Future thick, small-pixel solid state imagers pushing to soft X-ray energies will confront challenges with this approach. In this talk, I will present work from our group, which provided and calibrated the ACIS CCDs, to develop and understand new event characterization techniques. We have employed realistic simulations of soft X-ray detection in CCDs, validating the results through comparison to existing Chandra data and to lab experiments currently underway. We study a variety of event reconstruction algorithms and detector settings to optimize the X-ray response. These algorithms hold great promise for future X-ray imaging instruments, and the work has deepened our understanding of current CCDs such as those on Chandra.
RT Cru is a hard X-ray emitting symbiotic binary system consisting of a hot accreting compact core and a red giant mass donor. We have carried out spectral data analysis of X-ray observations of RT Cru taken by the High Energy Transmission Grating (HETG) in 2005 and the Low Energy Transmission Grating (LETG) on the Chandra X-ray Observatory in 2015. We employed an MCMC-based Bayesian statistical method (pyBLoCXS) to model thermal plasma components producing emission features in the LETG and HETG observations. Our statistical analysis implies the presence of soft (1.3 keV) and hard (9.6 keV) thermal plasma components in the LETG and HETG data, respectively. The soft thermal component detected in the LETG is heavily obscured by dense absorbing materials, and it could be associated with either an unseen jet or a shocked wind that deserves further investigation.
Detecting extended emission from X-ray sources generally requires high count rates and/or long exposures. However, comparing accurate models of the point spread function (PSF) to observational data can reveal extended X-rays with limited count statistics. I will discuss recent work applying Chandra PSF models from CIAO/MARX to detect extended X-rays in snapshot observations of high-redshift quasars.
X-ray telescopes like the Chandra X-ray Observatory have identified thousands of X-ray sources [317,167 sources identified in Chandra Source Catalog (Evans et al. 2020)]. These include isolated neutron stars, X-ray binaries, active galactic nuclei (AGN), supernova remnants, and chromospherically active stars. Classifying these X-ray sources will allow us to study key phenomena like the population distribution of compact objects across the Galaxy. Detailed spectral analysis to identify the type of individual X-ray sources is tedious for large X-ray surveys, and standard hardness ratios are not accurate enough (e.g., Hebbar et al. 2019). Here, we explore the use of machine learning to differentiate the spectra of active stars from AGN. We generated 100,000 fake spectra of active stars and AGN using spectral fits from Chandra Orion Ultradeep Project (Getman et al. 2005) and Chandra Deep Field South AGN (Tozzi et al. 2006) surveys and trained artificial neural networks to distinguish them. Our algorithm classifies the AGN and active star spectra with ~90% accuracy. Our results show that machine learning tools could provide an efficient way to classify X-ray sources in large catalogs.
NGC 281, the star-forming complex (a.k.a. the Pacman Nebula), is at a close distance of 2.1 kpc and is 300 pc above the Galactic plane, providing an unobscured view of star formation. We present a study of this region using Chandra, GMRT and Spitzer.
Supernova remnants (SNRs) offer the means to study supernovae (SNe) long after the original explosion and provide a unique insight into the mechanism that governs these energetic events. In this talk, I will discuss how spatially-resolved imaging and spectral analysis of X-ray SNR data, capturing reverse-shock heated ejecta metals, can provide insights on SN physics and progenitor properties. Ejecta emission line morphologies (2D and 3D) have been used to distinguish between core-collapse and type Ia origins, and the anti-correlation between ejecta asymmetries and neutron star (NS) velocities has provided evidence of asymmetric explosion mechanisms and NS kick mechanisms consistent with recent 3D simulations. Ejecta abundances and total ejecta masses, as measured through emission line fluxes, have been used to further constrain explosion processes and progenitors. Together, ejecta masses and morphologies enable astronomers to build comprehensive pictures of the distribution of elements in SNRs for better comparison to the predictions of 3D hydrodynamical SNe simulations.
As the multi-messenger era is now fully active, it is crucial that the community has a framework within which to analyze data from multiple messengers, wavelengths, and instruments in a statistically robust, common way. 3ML (https://threeml.readthedocs.io) provides an abstract, plugin-based data interface for instruments to combine analysis through each instrument's own unique likelihood. Users and instrument teams can create or use existing plugins to interface their data to a plethora of Bayesian and optimization packages in a uniform way. Analysis results are reported and stored in portable file formats that allow for the sharing and replication of results, enabling observers to produce robust scientific results that the community can interpret. 3ML currently supports, via standard plugins, many ground- and space-based observatories and is the analysis tool for some collaborations (HAWC, XIPE, Fermi-LAT, POLAR, GECAM).
We will use our machine learning pipeline to classify variable sources and explore their population properties (per class) and the origins of their variability. We will determine which features are most important for classification and how critical the multiwavelength information is. Our study will inform classifications of variable eROSITA sources where multiwavelength matching will be challenging.
The measurement of X-ray reverberation around black holes, as variations in the continuum emission echo off the accretion disk, has enabled a breakthrough in mapping the extreme environments just outside the event horizon. Reverberation probes the structure of the inner accretion flow and reveals the nature of the corona that produces the X-ray continuum. Measurements of X-ray reverberation via conventional Fourier techniques, however, are limited to relatively modest supermassive black holes in nearby galaxies. Recently-developed statistical techniques, based on Gaussian processes, enable the measurement of X-ray reverberation around more massive black holes in radio galaxies by combining multiple observations to provide a longer time baseline. This framework provides a statistical basis to robustly characterize time lags in lower signal-to-noise data, enabling the measurement of X-ray reverberation in Chandra observations of gravitationally lensed quasars. X-ray reverberation measurements in lensed quasars extend our understanding to rapidly accreting black holes beyond just the local Universe and pave the way for breakthrough science with the next-generation Lynx observatory.
In the nearby LIRG NGC 7552, the majority of the IR emission originates in a circumnuclear starburst ring, which has been resolved into discrete knots of star formation. We present results from recent Chandra and NuSTAR observations of NGC 7552, which reveal a major deficit in the X-ray emission from several star-forming knots and indicate age/metallicity effects as likely causes for the deficit.
Despite the importance of dual active galactic nuclei to wide-ranging astrophysical fields such as galaxy formation and gravitational waves, we are currently limited in our abilities to determine whether a given X-ray observation originates from one or two AGN for sources with small (< 0.5") angular separations. As a result, very few dual AGN have been confirmed, and most have physical separations > 1 kpc. In this presentation, I will review the statistical framework behind BAYMAX (Bayesian AnalYsis of Multiple AGN in X-rays), a code which is capable of statistically determining whether a given Chandra observation is best described by one or more point sources. Specifically, I will review results from BAYMAX analyzing Chandra observations of a variety of dual AGN candidates, and highlight analyses of BAYMAX quantifying the presence of other closely separated and low-count sources, such as jets. With BAYMAX we are discovering a dual AGN population where past spatial resolution limits have prevented systematic analyses. Overall, BAYMAX will be an important tool for correctly classifying candidate dual AGN and, for the first time, studying the dual AGN population across cosmic time.
Cosmological hydrodynamical simulations suggest that the WHIM accounts for the discrepancy in measured local baryon density. Detection of WHIM, although difficult, can be achieved using resonant scattering of radiation from background point sources (individually resolvable) as absorption amplitude scales linearly with WHIM column density. As O VII w resonance line is prominent, located in soft X-ray range and resonantly scattered, with a survey (of ~2 deg^2, ~5 Ms exposure) that can suppress point sources, we can trace intergalactic oxygen independent of temperature and IGM density. We use the CXB spectrum from the 2.15 deg^2, 4.6 Ms exposure CCLS field. After removing X-ray sources, we model spectrum using powerlaw (unresolved sources), APEC (LHB) and broken powerlaw (WHIM resonant scattering) with spectral index ~-3.7. From the norm of the broken powerlaw, we constrain O VII abundance in WHIM to solar abundance levels. Limit can be further constrained by in-depth analysis of foreground emission from LHB, SWCE, direct thermal emission from WHIM. Similar meta-analyses with several Chandra spectra (higher total exposure time) will yield stricter constraints to WHIM O VII abundance.
I will present the most recent developments in our supervised machine-learning approach to X-ray source classification. Specifically, I will talk about the pipeline built to classify CXO sources using X-ray properties provided from CSC 2.0 and multiwavelength properties from infrared, near-IR, and optical catalogs. The pipeline now incorporates the temporal variability features as well as the measurement uncertainties and estimates of confusion for the cross-matching. I will describe how we currently account for the uncertainties and confusion and show the impact of including this additional information on the classification outcomes.
The Sherpa package was developed by the Chandra X-ray Center (CXC) as a general purpose fitting and modeling tool, with specializations for handling X-ray astronomy data. It is provided as part of the CIAO analysis package, where the code is the same as that available from the Sherpa GitHub page as a Python package. We provide an overview of Sherpa's current capabilities with a few examples.
The spatial resolution of Chandra is unprecedented among X-ray telescopes. It is most desirable to make the full use of the information through modern imaging analysis techniques. We present our application of a deep learning approach to classifying synthetic Chandra images of galaxy clusters from IllustrisTNG simulation. Without any spectral information, we are able to identify cool core, weak cool core, and non-cool core clusters of galaxies that are defined by their central cooling times, which outperforms traditional methods. We further employed class activation mapping to localize discriminative regions for the classification decision, which unveiled the connection between the merger history and dynamic state of galaxy clusters.
We perform a systematic search for X-ray quasi-periodic oscillations (QPOs) in AGNs using the Lomb-Scargle periodogram in the Chandra Deep Field South (CDFS), a ~7 Ms survey spanning 16.5 years. No solid detection is confirmed in a sample of ~1000 sources. The detection efficiency, computed from simulated datasets, helps constrain the persistent QPO fraction in AGN under two models: weak (strong) QPOs in AGN with low (high) variability amplitude. Two marginal transient QPO candidates are also reported, with periods of about 13273 s and 7065 s, though with significance lower than 3σ due to the large number of trials.
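For readers unfamiliar with the periodogram used here, the following is a minimal sketch of the classical normalized Lomb-Scargle power for unevenly sampled data. The simulated light curve (a sinusoid at one of the candidate periods with Gaussian noise), the sampling, and the frequency grid are assumptions for illustration only; the sketch does not model Poisson statistics, observation gaps, or the trials correction needed to assign a significance.

```python
import math
import random

def lomb_scargle(times, values, frequencies):
    """Normalized Lomb-Scargle power at each trial frequency (in Hz)."""
    n = len(values)
    mean = sum(values) / n
    resid = [v - mean for v in values]
    variance = sum(r * r for r in resid) / (n - 1)
    powers = []
    for f in frequencies:
        omega = 2.0 * math.pi * f
        # Time offset tau that makes the sine and cosine terms orthogonal.
        s2 = sum(math.sin(2.0 * omega * t) for t in times)
        c2 = sum(math.cos(2.0 * omega * t) for t in times)
        tau = math.atan2(s2, c2) / (2.0 * omega)
        cos_terms = [math.cos(omega * (t - tau)) for t in times]
        sin_terms = [math.sin(omega * (t - tau)) for t in times]
        cc = sum(c * c for c in cos_terms)
        ss = sum(s * s for s in sin_terms)
        yc = sum(r * c for r, c in zip(resid, cos_terms))
        ys = sum(r * s for r, s in zip(resid, sin_terms))
        powers.append((yc * yc / cc + ys * ys / ss) / (2.0 * variance))
    return powers

# Simulated, unevenly sampled light curve with a hypothetical 7065 s signal.
rng = random.Random(0)
true_period = 7065.0
times = sorted(rng.uniform(0.0, 100000.0) for _ in range(150))
rates = [1.0 + 0.5 * math.sin(2.0 * math.pi * t / true_period) + rng.gauss(0.0, 0.1)
         for t in times]
frequencies = [2e-5 + i * (5e-4 - 2e-5) / 399 for i in range(400)]
powers = lomb_scargle(times, rates, frequencies)
best_period = 1.0 / frequencies[max(range(len(powers)), key=powers.__getitem__)]
print(round(best_period))
```

Because many frequencies and many sources are searched, the false-alarm probability of the highest peak has to account for the number of independent trials, which is why marginal candidates can fall below 3σ.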
As X-ray astronomy enters the era of Big Data, a reliable, interpretable and automated classification of X-ray sources becomes increasingly valuable. While hard and fast rules often lead to inaccurate classifications, more black-box approaches such as the random forest algorithm perform better but are difficult to interpret. In my contribution I will present how we developed a probabilistic classification inspired from the Naive Bayes Classifier, showing good, interpretable results on Swift and XMM catalogues, and yet to be tested on Chandra sources. In the last part of my talk, I will show how we can address the very small populations in some object classes using citizen science, with the development of a new platform designed for the classification of XMM sources by volunteers, but also adaptable to the Chandra catalogue.
We compare the observed population of BeXRBs in the SMC with populations of BeXRB-like systems simulated by the COMPAS code. Assuming that BeXRBs experienced only stable mass transfer, their masses suggest that at least ∼ 30% of the mass donated by the progenitor of the neutron star is typically accreted by the companion. These results affect rate predictions of double compact object mergers.
We employ unsupervised learning methods, including K-means and Gaussian Mixtures applied to a list of X-ray properties, to classify high energy sources in the Chandra Source Catalog (CSC). Given the relatively small fraction of the sources that have been independently classified, unsupervised methods offer a suitable approach to probabilistically assigning classes to over 315,000 X-ray sources. We achieve this by associating specific K-means and GM clusters with those CSC objects that have a classification in the SIMBAD database, and then assigning probabilistic classes by association to unclassified objects based on a few cluster metrics. We are able to successfully identify clusters of previously identified objects that likely belong to the same class, and even within groups that were previously classified under general categories, such as "galaxies", "QSO", and "YSO", we find sub-classes related to their unique variability and spectral properties. The result of this exercise is a robust probabilistic classification (i.e. a posterior over classes) for a large fraction of CSC sources.
The analysis of individual X-ray sources that appear in a crowded field can easily be compromised by the misallocation of recorded events to their originating sources. Even with a small number of sources that nonetheless have overlapping point spread functions, the allocation of events to sources is a complex task that is subject to uncertainty. We develop a Bayesian method designed to sift high-energy photon events from multiple sources with overlapping point spread functions, leveraging the differences in their spatial, spectral, and temporal signatures. The method probabilistically assigns each event to a given source. Such a disentanglement allows more detailed spectral or temporal analysis to focus on the individual component in isolation, free of contamination from other sources or the background. We apply our methods to two stellar X-ray binaries, UV Cet and HBC 515 A, observed with Chandra. We demonstrate that our methods are capable of removing the contamination due to a strong flare on UV Cet B from its companion, which was approx. 40 times weaker during that event, and that evidence for spectral variability at timescales of a few ks can be determined in HBC 515 Aa and HBC 515 Ab.
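The core of probabilistic event allocation can be sketched with spatial information alone. The example below assumes circular Gaussian PSFs, known source positions and intensities, and a flat background, none of which is taken from the abstract; the full method also uses spectral and temporal signatures and infers the source parameters rather than fixing them.

```python
import math

def gaussian_psf(x, y, cx, cy, sigma):
    """Circular Gaussian PSF density at (x, y) for a source at (cx, cy)."""
    r2 = (x - cx) ** 2 + (y - cy) ** 2
    return math.exp(-0.5 * r2 / sigma ** 2) / (2.0 * math.pi * sigma ** 2)

def allocation_probabilities(event, sources, background_density):
    """Probability that an event came from each source; the last entry
    is the probability that it came from the background."""
    x, y = event
    weights = [s["counts"] * gaussian_psf(x, y, s["x"], s["y"], s["sigma"])
               for s in sources]
    weights.append(background_density)
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical pair: a flaring source 40 times brighter than its neighbour.
sources = [
    {"x": 0.0, "y": 0.0, "sigma": 0.5, "counts": 400.0},
    {"x": 1.5, "y": 0.0, "sigma": 0.5, "counts": 10.0},
]
probs = allocation_probabilities((1.4, 0.1), sources, background_density=0.01)
print([round(p, 3) for p in probs])
```

Even for an event landing almost on top of the faint source, a large share of the probability goes to the bright neighbour, which is why a hard spatial cut contaminates the faint source's spectrum and light curve.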
The Supernova X-Ray Database (SNaX) was established a few years ago to make X-ray data on supernovae (SNe) publicly available via an elegant searchable web interface. The database is moderated, and contains only data from published sources. A graphical user interface is provided to make and download highly customizable plots. A template allows users to easily submit data to the repository for storage and retrieval. As far as possible, fluxes and luminosities are given in a standard wave band (0.3–8 keV) so that they can be easily compared. The data are stored in a MySQL relational database. The web interface has recently been updated to PHP 7, received security updates, and moved to a new server, ensuring its long-term stability. Users are encouraged to upload their X-ray data to the database, and are free to download the data they may need in their work. The database can be used to explore the properties of the X-ray emission from young SNe, the implications for the SN environment, progenitor mass-loss and the identity of the progenitors.
Heat [1] is a Python library for scientific Big Data analytics, meant to bridge the gap between NumPy/SciPy users and high-performance computing. Under the hood, Heat operations and machine learning / deep learning algorithms are optimized to exploit the available resources via MPI. At the same time, HeAT’s NumPy-like API makes it straightforward for SciPy users to implement HPC applications, or to parallelize their existing ones. HeAT relies on PyTorch for its data objects, implying efficient on-process operations, GPU support, and array differentiability. In this talk, I will show you some of HeAT's inner workings, and a few use cases in applied science. I will also touch on the points of contact and collaboration possibilities within the Chandra community. [1] https://github.com/helmholtz-analytics/heat
Chandra observed Betelgeuse during the Great Dimming. No X-ray emission was detected. We will discuss how upper limits are estimated from the data and report on the resultant constraints on the X-ray luminosity and absorption column.
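One common way to turn a non-detection into a count upper limit is the classical Poisson construction sketched below. The abstract does not state which method the authors use, so this is an assumption for illustration only; the function names, the default 95% confidence level, and the treatment of the background as exactly known are all choices made for this example.

```python
import math

def poisson_cdf(n, mu):
    """P(N <= n) for N ~ Poisson(mu), summed term by term."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total

def upper_limit(n_obs, bkg, cl=0.95, tol=1e-9):
    """Classical upper limit on the expected source counts s.

    Returns the smallest s >= 0 for which observing n_obs or fewer
    counts has probability at most 1 - cl, given a known expected
    background of bkg counts in the source aperture.
    """
    target = 1.0 - cl
    if poisson_cdf(n_obs, bkg) <= target:
        # Background alone already makes n_obs improbably low.
        return 0.0
    lo, hi = 0.0, 10.0
    while poisson_cdf(n_obs, hi + bkg) > target:
        hi *= 2.0  # expand the bracket until it contains the limit
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n_obs, mid + bkg) > target:
            lo = mid
        else:
            hi = mid
    return hi

# Zero counts observed, negligible background, 95% confidence.
limit_counts = upper_limit(0, 0.0)
```

With zero observed counts and no background the limit reduces to -ln(0.05), about 3 counts. Converting counts to a luminosity limit additionally requires the exposure, effective area, distance, and an assumed spectrum and absorption column, none of which are modelled here.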
Standard statistical procedures like the chi-squared test can be used for detecting variations in real-valued brightness measurements. But more specialized methods are needed when the data consist of photon arrival times. We review several approaches with applications to Chandra observations of flaring pre-main sequence stars:
X-ray spectro-imagers can produce a 3D (X, Y, Energy) view of extended sources like supernova remnants or galaxy clusters. In those objects, physical components (e.g. cold and hot plasmas) are nested and projected on the sky. Using a new type of component analysis, we were able to disentangle red- and blue-shifted components in the Cas A SNR Chandra data and provide constraints on the explosion mechanism.
Two bright X-ray transients were reported from the Chandra Deep Field South (CDF-S) archival data, namely CDF-S XT1 and XT2. We proposed a unified model to interpret both transients within the framework of the BNS merger magnetar model. We fit the magnetar model to the light curves of both transients and derived consistent parameters for the two events.
The Python X-ray Spectral Interpretation System (pyXsis) is a lightweight Python library for loading and visualizing X-ray spectrum files, built upon the Astropy specutils package. It is very much a work in progress, and I would like to advertise it to people who are looking for options to merge astronomical Python tools with X-ray spectral analysis.
We discuss a novel method, based on machine learning, to recover the 3D temperature profiles in galaxy clusters from the observed 2D (projected) profiles. We have developed a fully non-parametric regularization procedure, which takes into account both projection and PSF effects. The approach is based on a data-driven learnt model obtained from an auto-encoder network that aims to capture the intrinsic low-dimensional, non-linear nature of the temperature profiles. The resulting generative procedure can be applied to estimate 3D temperature profiles from noisy 2D observations, and can in principle be adapted to recover any thermodynamic quantity. In a first approach, learning has been performed using the 3D temperature profiles of more than 160 clusters observed with XMM-Newton, derived from MCMC parametric fits. We are extending our analysis by incorporating 3D temperature profiles from simulations in the learning stage to assess possible biases. We present and discuss the first results of our new technique, tested on a representative sub-sample of the CHEX-MATE (Cluster HEritage project with XMM-Newton-Mass Assembly and Thermodynamics at the Endpoint of structure formation) sample.
We present a methodology for numerically modeling astrophysical systems with 3D magnetohydrodynamic simulations, to synthesize X-ray spectra and line diagnostics for direct comparison with observations. We use 3D simulations and very deep Chandra HETG spectra of the magnetic O star theta 1 Ori C to demonstrate our approach.
Axions are hypothetical particles arising in string theory scenarios, as a solution to the strong CP problem, and as a dark matter candidate. Axions may interact weakly with the photon, interconverting with photons in the presence of a background magnetic field. This effect allows us to search for axions with Chandra data by searching for spectral modulations in point sources shining through the magnetic fields of galaxy clusters. This technique has been used to place strong bounds on the axion photon coupling. However, the approach is limited by our imprecise knowledge of cluster magnetic fields. I will discuss possible improvements to axion searches with Chandra using machine learning techniques.
Thanks to XMM and Chandra we now have 20 years of experience in identifying the counterparts to X-ray sources. I will show how the acquired knowledge has been used for identifying reliable counterparts to ROSAT and eROSITA sources by combining Machine Learning and Bayesian statistics, regardless of the classification of the sources. I will then compare the results with those obtained with standard methods and offer some suggestions on how the method can be further improved.
The 2008 paper by Budavari & Szalay proposed an excellent Bayesian algorithm for spatial crossmatching of source catalogs. But it left us with several issues relevant for the Chandra Source Catalog:
The first four of these have been implemented in the current version of Xmatch. Issue 2 is particularly relevant for the CSC, because Chandra’s PSF varies very significantly across the field of view. We have combined source size and PSF by also matching sources on “raw size,” which for point sources is their PSF. This is an important point of consideration for issue 3. Ambiguously matched clusters are inspected to identify unique pairs that stand out among their peers. Consistency in probability thresholds is achieved by applying a self-consistency criterion (by Budavari & Szalay). Dan Nguyen has developed a solution for issue 5, but this has not yet been implemented in the current version.
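For orientation, the two-catalog Bayes factor of Budavari & Szalay in its simplest form (circular Gaussian position errors, small angular separations) can be sketched as follows. This does not reproduce Xmatch or any of the extensions discussed above, such as elliptical errors or source extent; the function names and the arcsecond-based interface are assumptions made for this example.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arcsecond

def bayes_factor(sep_arcsec, sigma1_arcsec, sigma2_arcsec):
    """Bayes factor for 'same source' versus 'unrelated sources'.

    Small-angle, circular-Gaussian limit:
        B = 2 / (s1^2 + s2^2) * exp(-psi^2 / (2 * (s1^2 + s2^2)))
    with the separation psi and the errors s1, s2 in radians.
    """
    psi = sep_arcsec * ARCSEC
    s2 = (sigma1_arcsec * ARCSEC) ** 2 + (sigma2_arcsec * ARCSEC) ** 2
    return 2.0 / s2 * math.exp(-psi**2 / (2.0 * s2))

def match_posterior(bf, prior):
    """Posterior match probability from a Bayes factor and a prior."""
    return 1.0 / (1.0 + (1.0 - prior) / (bf * prior))

# Two detections 1 arcsec apart with 0.5 and 0.8 arcsec errors.
b = bayes_factor(1.0, 0.5, 0.8)
```

The prior passed to `match_posterior` is typically very small, because it reflects the chance that two arbitrary catalog entries are the same object, and the self-consistency criterion mentioned above is one way to set it. That step is not shown here.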
The XMM-Newton Survey Science Centre consortium provides catalogues of serendipitously detected sources in all observations taken since early 2000. To date, more than 600,000 X-ray sources have been observed, some of them up to 80 times. In the repeatedly visited sky areas, additional stacked source detection is performed, and the results are published in the source catalogue from overlapping observations. Through the long effective exposure time of the simultaneously processed observations, higher sensitivity and better constrained source parameters can be achieved than in source detection on individual observations. Stacking also makes it possible to directly assess the inter-observation variability of sources and to derive fluxes in those observations in which a source would not be detected individually. The talk will introduce the recently released catalogue versions and highlight examples of long-term variability.
Multiwavelength identifications of extragalactic X-ray sources are essential for using these sources to understand their parent galaxies. Can we use machine learning with the X-ray properties alone to identify promising candidates for follow-up? We trained several machine learning algorithms on Chandra observations of M31, including photometry in non-standard X-ray bands, and found random forest classifiers to give the best performance. This talk will discuss both the algorithms and performance metrics, as well as prospects for future studies.
Future X-ray observatories such as Athena, Lynx, and AXIS have ambitious science goals, including mapping hot gas structures in the universe, such as clusters and groups of galaxies and the intergalactic medium. These deep observations of faint diffuse sources are limited by the statistical and systematic uncertainties in the background produced by high energy particles.
I will discuss efforts to better understand the particle environment experienced by existing X-ray observatories, such as Chandra and XMM, and simulations of future missions like Athena, including how these particles interact with the spacecraft to produce events in the detector. This includes revisiting standard event detection and filtering processes and whether improvements could be made that would reduce the particle-induced background in the final data or increase our knowledge of that background and thus reduce the systematic error.
Chandra and other observatories observe X-ray sources projected onto the sky, but numerical modeling of the same sources is done in 3D, and focuses on the evolution and characteristics of physical quantities (such as density, temperature, and metallicity) which we can only infer via X-ray emission mechanisms. Bridging this gap requires mock observations. In this talk, I will present pyXSIM, a Python package which takes 3D simulations of astrophysical objects from a variety of source types and produces realistic mock X-ray observations which can be convolved with instrument models for observatories such as Chandra, XMM-Newton, XRISM, Athena, and Lynx. I will also discuss SOXS, a related Python package which provides a number of tools to support producing mock X-ray observations, in particular simulating the responses of various instruments. I will end by discussing selected science examples which have been enabled by these two pieces of software.
X-ray surveys provide spectra for numerous sources with low information with respect to their spectral complexity. We developed a Bayesian method to fit spectra automatically using a consistent physical model and informative priors, and to reconstruct the distributions of the main spectral parameters in an unbiased way. The method has been developed for XMM, but can be easily adapted to Chandra.
Detection of templates (e.g., sources) embedded in low-number count Poisson noise is a common problem in astrophysics, including source detection in X-ray images. The solutions in the X-ray-related literature are sub-optimal in some cases. We derive the optimal statistics for template detection in the presence of Poisson noise. We demonstrate that, for known template shape, this method provides higher completeness, for a fixed false-alarm probability value, compared with filtering the image with the point-spread function (PSF), or the Mexican-hat wavelet. For some background levels, our method improves the sensitivity of source detection by more than a factor of two over the Mexican-hat wavelet filtering. This filtering technique can also be used for fast PSF photometry and flare detection; it is efficient and straightforward to implement.
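For a known template shape and a flat background, the textbook Neyman-Pearson log-likelihood ratio between "background plus source" and "background only" leads to a filter of the form ln(1 + F P / B), where P is the normalised template, F the assumed source counts, and B the background per pixel. The sketch below applies that filter with plain NumPy. It illustrates the general principle and may differ in detail from the statistic derived by the authors; the function name and parameters are hypothetical, and an odd-sized template is assumed.

```python
import numpy as np

def poisson_matched_filter(counts, psf, flux, bkg):
    """Log-likelihood-ratio score map for a template in Poisson noise.

    counts : 2D array of observed counts per pixel
    psf    : 2D template with odd dimensions (normalised internally)
    flux   : assumed total source counts F
    bkg    : assumed flat background counts per pixel B (> 0)

    score(x) = sum_q counts(x + q) * ln(1 + F * P(q) / B) - F
    Pixels outside the image are treated as containing zero counts.
    """
    counts = np.asarray(counts, dtype=float)
    psf = np.asarray(psf, dtype=float)
    psf = psf / psf.sum()

    kernel = np.log1p(flux * psf / bkg)
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    h, w = counts.shape

    padded = np.pad(counts, ((ph, ph), (pw, pw)))  # zero padding
    score = np.full((h, w), -float(flux))
    # Cross-correlate by accumulating one shifted copy per kernel pixel.
    for i in range(kh):
        for j in range(kw):
            score += kernel[i, j] * padded[i:i + h, j:j + w]
    return score

# Toy image: a single count in the centre of an empty 5x5 field.
image = np.zeros((5, 5))
image[2, 2] = 1.0
template = np.array([[1.0, 1.0, 1.0], [1.0, 8.0, 1.0], [1.0, 1.0, 1.0]])
score = poisson_matched_filter(image, template, flux=4.0, bkg=0.5)
```

A detection threshold on the score map would then be chosen to give the desired false-alarm probability, for example by simulating background-only images. Note that at high background the kernel approaches F P / B, so the filter tends towards ordinary PSF filtering.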
In recent years, technological advances have dramatically increased the quality and quantity of data available to astronomers. Newly launched or soon-to-be launched space-based instruments provide massive new surveys resulting in new catalogs containing terabytes of data, high resolution spectroscopy and imaging across the electromagnetic spectrum, and incredibly detailed movies of dynamic and explosive processes in the solar atmosphere. Using such data to learn about, for example, the underlying physical processes of astronomical sources often requires a sequence of statistical analyses, the outputs of earlier analyses being fed into subsequent analyses. Typically this involves a combination of data-driven methods that are agnostic to the underlying physics with model-based methods that are tailored to specific scientific questions. In this talk we explore how this plays out in the context of several examples taken from high-energy astrophysics. We will discuss how analyses based on pragmatic approximations compare with fully-Bayesian analyses, all with the goal of obtaining a coherent overall statistical analysis.
Active Galactic Nuclei (AGN) significantly impact the evolution of their host galaxies by expelling large fractions of gas with wide-angle outflows. The X-ray band is key to understanding how these winds affect their environment, because they are heated to high, X-ray emitting temperatures. In this talk, I will introduce our Bayesian framework for characterizing AGN outflows, which provides substantial improvements in our ability to explore parameter space and perform robust model selection. We applied this framework to new and archival deep Chandra HETG observations of the Seyfert galaxy NGC 4051. We detected six components, spanning velocities from 100s to 10,000s km/s, and mapped their evolution across an eight-year period. The most significant wind component is collisionally ionized and remains remarkably stable between the two epochs. This is the first detection in absorption of such an AGN outflow, which was enabled by using a Bayesian approach. The estimated total outflow power surpasses 5% of the AGN bolometric luminosity, making it important in the context of galaxy-black hole interactions.
The construction of a reliable training dataset is essential to the classification of sources using supervised machine learning. Distance to the source is a key feature that is difficult to incorporate into training datasets due to large uncertainties. It is thus rarely used as a feature in the classification of X-ray sources. We incorporate distance information from the Gaia eDR3 distance catalog into the training dataset of our random-forest machine-learning pipeline, and explore the impacts on the classification of CXO sources in the field of several open clusters.
HMXBs present an opportunity to study winds of massive stars, by measuring the wind's X-ray absorption. Due to the high variability induced by their structure, analyses need to incorporate both spectroscopic and timing data, a task at which excess variance (RMS) spectra excel. I present a first-time RMS study of Cyg X-1 Chandra HETG data, highlighting results, first models, and future prospects.
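As background on the quantity involved, a widely used estimator of the excess variance subtracts the mean squared measurement error from the sample variance of a light curve. Evaluating it in a series of energy bands gives an RMS spectrum. The sketch below shows the estimator for a single band. It is a generic illustration, not the analysis presented in the talk, and the function name and return convention are assumptions.

```python
import numpy as np

def excess_variance(rate, err):
    """Normalised excess variance and fractional RMS of a light curve.

    rate : count rates in consecutive time bins (one energy band)
    err  : 1-sigma measurement uncertainties on those rates

    Returns (nxs, fvar), where
        nxs  = (S^2 - mean(err^2)) / mean(rate)^2
        fvar = sqrt(nxs), or NaN if the excess variance is negative
    and S^2 is the unbiased sample variance of the rates.
    """
    rate = np.asarray(rate, dtype=float)
    err = np.asarray(err, dtype=float)

    sample_var = rate.var(ddof=1)      # total observed variance
    noise_var = np.mean(err**2)        # variance expected from noise
    nxs = (sample_var - noise_var) / rate.mean() ** 2

    fvar = float(np.sqrt(nxs)) if nxs > 0 else float("nan")
    return float(nxs), fvar

# Toy light curve alternating between two levels.
nxs, fvar = excess_variance([8.0, 12.0, 8.0, 12.0], [1.0, 1.0, 1.0, 1.0])
```

Repeating the calculation on light curves extracted in different energy bands, using the same time binning, gives the fractional variability as a function of energy.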
We estimate the coronal density of Capella using the O VII line systems in the soft X-ray regime over the course of the Chandra mission. We combine measures of uncertainty due to atomic data with statistical uncertainty, and explore the effect of different emission measure distributions, to derive meaningful uncertainties on the plasma density in the coronae of Capella.