Computational Chemistry

Nature is fond of geometry, and different molecular structures show unique geometric traits. However, training neural networks to predict properties of such geometric structures comes with a unique set of challenges. Our ongoing work focuses on accelerating the training of such neural networks on emerging AI accelerators, heterogeneous computing that integrates simulations and machine learning on supercomputers, and creative combinations of different machine-learning techniques to support a wider range of chemical systems.

A Journey with Graphs: Connecting the Dots

What do chemical compounds, power grids, and Wikipedia have in common? Each is a manifestation of different entities coming together, something that is commonly modeled as a graph, or network. A significant part of my career has been dedicated to "connecting the dots": learning and building tools that span diverse fields in computer science, including data mining techniques to extract structural patterns, searching a graph database for complex patterns, semantic reasoning using knowledge graphs, predictive and generative machine learning methods for graphs, and explaining decisions on complex multi-modal data in natural language. StreamWorks, a system we developed for continuous pattern detection and reasoning, received an R&D100 award in 2018 for its application to cyber-security.

Medical Informatics

The field of digital medicine, the algorithmic design of intelligent software to measure, intervene in, and improve human health, fascinates me. Our work ranges from developing new machine learning methods that enable transfer learning across diseases, to improving predictive modeling via phenotype discovery, to designing dialog-based AI systems that can reason about their conclusions and recommendations.

Additional Details

News, Publications, Patents and Talks

News, Recent Talks and Organizational Activities

  1. Oct 2023 - Stanford Graph Learning Workshop
  2. Sep 2023 - ChemReasoner selected for Microsoft Accelerate Foundation Models Research Initiative!
  3. Sep 2023 - DOE INCITE Panel Review
  4. Sep 2023 - Invited Talk, AI Hardware and Edge AI Summit, Santa Clara
  5. May 2023 - HydroML Workshop, Lawrence Berkeley National Laboratory
  6. Feb 2023 - SciML Group, Science and Technology Facilities Council, UK
  7. Feb 2023 - Institute for Data Science at New Jersey Institute of Technology
  8. Nov 2022 - Graphcore@Supercomputing Conference
  9. Nov 2022 - Webinar: Accelerating Molecular Graph Neural Networks
  10. Oct 2022 - National Research Data Infrastructure (NFDI) for Catalysis, Germany
  11. Sep 2022 - Washington State University Data Day
  12. Jul 2022 - AI4Science and Security Workshop, UC Davis
  13. Jun 2022 - GraphConnect, Austin, TX
  14. Mar 2022 - VA Long COVID Workgroup
  15. Mar 2022 - School of Computational Science and Engineering, Georgia Institute of Technology
  16. Mar 2022 - AAAI Workshop on AI-Based Design and Manufacturing

Publications/Contributed Talks

  1. Sprueill H.W., U. Sanyal, M.V. Olarte, C. Edwards, H. Ji and S. Choudhury. 2023. "Monte Carlo Thought Search: Large Language Model Querying for Complex Scientific Reasoning in Catalyst Design" Findings of the Association for Computational Linguistics: EMNLP 2023. code and data
  2. Chen A.R., C.M. Ham, S. Choudhury, and K. Agarwal. 07/27/2023. "Leveraging General Knowledge to Improve Patient Embeddings." PNNL-SA-187615.
  3. Bilbrey J.A., J.S. Firoz, H.W. Sprueill, and S. Choudhury. 07/14/2023. "Fast and Accurate Prediction of Potential Energy Functions for Molecular Structures." Presented by J.A. Bilbrey at Seagate AI/ML Virtual Speaker Series.
  4. Herman K.M., J.A. Bilbrey, H.W. Sprueill, L. Ward, S. Choudhury, and S.S. Xantheas. 06/20/2023. "Development of robust ab initio-based neural network potential energy surfaces for aqueous systems based on transfer and active learning strategies." Presented by K.M. Herman at CECAM Psi-k Research Conference, Berlin, Germany.
  5. Engel A.W., T.Y. Chiang, S. Choudhury, A.D. Sarwate, Z. Wang, I. Dumitriu, and N. Frank. 2023. "Robust Explanations for Deep Neural Networks via Pseudo Neural Tangent Kernel Surrogate Model."
  6. Khatir M., N. Choudhary, K. Agarwal, S. Choudhury, and C. Reddy. 2023. "Pseudo-Poincaré: A Unification Framework for Euclidean and Hyperbolic Graph Neural Networks." In 2023 International Joint Conference on Artificial Intelligence (IJCAI).
  7. Chen J., H. Sung, X. Shen, S. Choudhury, and A. Li. 2023. "BitGNN: Unlocking the Performance Potential of Binary Graph Neural Networks on GPUs." In International Conference on Supercomputing (ICS-2023)
  8. Ward L., J. Pauloski, R. Chard, Y. Babuji, G. Sivaraman, S. Choudhury, and K. Chard, et al. 2023. "Efficient AI-Guided Simulation Workflows across Heterogeneous Systems via Function-as-a-Service and Object Proxies." The thirty-second Heterogeneity in Computing Workshop (HCW), International Parallel and Distributed Processing Symposium (IPDPS)
  9. Choudhury S., K. Agarwal, C.M. Ham, and S. Tamang. 2023. "MediSage: An AI Assistant for Healthcare via Composition of Neural-Symbolic Reasoning Operators." The Web Conference.
  10. Helal H., J.S. Firoz, J.A. Bilbrey, M.M. Krell, T. Murray, A. Li, and S.S. Xantheas, et al. 2022. "Extreme Acceleration of Prediction Models for Quantum Chemistry Over Molecular Graph Databases."
  11. Bilbrey J.A., H.W. Sprueill, K.M. Herman, S.S. Xantheas, P. Das, M. Lopez Roldan, and M. Kraus, et al. 2022. "Reducing Down(stream)time: Pretraining Molecular GNNs using Heterogeneous AI Accelerators." In Machine Learning and the Physical Sciences Workshop at the 36th conference on Neural Information Processing Systems (NeurIPS).
  12. Mukherjee S., S. Nandanoori, S. Guan, K. Agarwal, S. Sinha, S. Pal, and S. Kundu, et al. 2022. "Learning Distributed Geometric Koopman Operator for Sparse Networked Dynamical Systems." In Learning on Graphs (LoG) conference.
  13. Agarwal K., S. Choudhury, S. Tipirneni, P. Mukherjee, C.M. Ham, S. Tamang, and M. Baker, et al. 2022. "Preparing for the next pandemic via transfer learning from existing diseases with hierarchical multi-modal BERT: a study on COVID-19 outcome prediction." Nature Scientific Reports 12.
  14. Engel A.W., T.Y. Chiang, Z. Wang, A.D. Sarwate, and S. Choudhury. 2022. TorchNTK: A Library for Calculation of Neural Tangent Kernels of PyTorch Models. PNNL-32867.
  15. Wang P., T. Shi, K. Agarwal, S. Choudhury, and C. Reddy. 2022. "Attention-based Aspect Reasoning for Knowledge Base Question Answering on Clinical Notes." Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics.
  16. Nandanoori S., S. Guan, S. Kundu, S. Pal, K. Agarwal, Y. Wu, and S. Choudhury. 2022. "Graph Neural Network and Koopman Models for Learning Networked Dynamics: A Comparative Study on Power Grid Transients Prediction." IEEE Access 10.
  17. Nandanoori S., S. Pal, S. Sinha, S. Kundu, K. Agarwal, and S. Choudhury. 2021. "Data-driven Distributed Learning of Multi-agent Systems: A Koopman Operator Approach." In IEEE 60th Conference on Decision and Control (CDC 2021)
  18. Alexander F.J., J.A. Ang, S. Choudhury, S. Ghosh, Y. Huang, N. Kumar, and J.A. Bilbrey, et al. 2021. "Co-design Center for Exascale Machine Learning Technologies (ExaLearn)." The International Journal of High Performance Computing Applications.
  19. Guan S., H. Ma, S. Choudhury, and Y. Wu. 2021. "GEDet: Detecting Erroneous Nodes with A Few Examples." In 47th International Conference on Very Large Data Bases.
  20. Choudhury S., K. Agarwal, C.M. Ham, P. Mukherjee, S. Tang, S. Tipirneni, and C. Reddy, et al. 2021. "Tracking the Evolution of COVID-19 via Temporal Comorbidity Analysis from Multi-Modal Data." In AMIA 2021 Annual Symposium.
  21. Bilbrey J.A., J.M. Brandi, M. Ashby, and S. Choudhury. 07/20/2021. "Geometric learning of molecular conformational dynamics." SIAM Conference on Discrete Mathematics 2021, Spokane, Washington.
  22. Bilbrey J.A., L. Ward, S. Choudhury, N. Kumar, and G. Sivaraman. 2021. "Evening the Score: Targeting SARS-CoV-2 Protease Inhibition in Graph Generative Models for Therapeutic Candidates." In ICLR 2021 Workshop on Machine Learning for Preventing and Combating Pandemics.
  23. Nandanoori S., S. Kundu, S. Pal, S. Choudhury, and K. Agarwal. 2021. "Nominal and adversarial synthetic PMU data for standard IEEE test systems."
  24. Wang, P., Agarwal, K., Ham, C., Choudhury, S. and Reddy, C.K., 2021. Self-Supervised Learning of Contextual Embeddings for Link Prediction in Heterogeneous Networks. In Proceedings of The Web Conference 2021.
  25. U.S. Patent No. 10,855,706 - System and methods for automated detection, reasoning and recommendations for resilient cyber systems - Issued 12/01/2020. Sutanay Choudhury, Khushbu Agarwal, Pin-Yu Chen, Indrajit Ray.
  26. U.S. Patent No. 10,810,210 - Performance and usability enhancements for continuous subgraph matching queries on graph-structured data - Issued 10/20/2020. Sutanay Choudhury, George Chin, Jr., Khushbu Agarwal, Sherman J. Beus.
  27. Choudhury S., J.A. Bilbrey, L. Ward, S.S. Xantheas, I. Foster, J. Heindel, and B. Blaiszik, et al. 2020. HydroNet: Benchmark Tasks for Preserving Long-range Interactions and Structural Motifs in Predictive and Generative Models for Molecular Data. In Machine Learning and the Physical Sciences - Workshop at the 34th Conference on Neural Information Processing Systems (NeurIPS).
  28. Nandanoori S., S. Kundu, S. Pal, K. Agarwal, and S. Choudhury. 2020. Model-Agnostic Algorithm for Real-Time Attack Identification in Power Grid using Koopman Modes. In IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (IEEE SmartGridComm2020).
  29. Bilbrey J.A., J. Heindel, M. Schram, P. Bandyopadhyay, S.S. Xantheas, and S. Choudhury. 2020. "A Look Inside the Black Box: Interpretation of a Continuous Filter Convolutional Neural Network (CF-CNN) for the Potential Energy Surface of Water Clusters using Graph-Theoretical Descriptors." Journal of Chemical Physics special issue on Machine Learning.
  30. Nandanoori S., S. Pal, S. Choudhury, S. Kundu, and K. Agarwal. 04/13/2020. "GridSTAGE: Spatio Temporal Adversarial Scenario GEneration framework."
  31. Bilbrey J.A., S. Choudhury, L. Ward, S.S. Xantheas, and J. Heindel. "Molecule Generation with Deep Reinforcement Learning at Exascale." Conference on Data Analytics, Santa Fe, 2020.
  32. Agarwal K., T. Eftimov, R. Addanki, S. Choudhury, S. Tamang, and R.J. Rallo Moya. 2019. "Snomed2Vec: Evaluation of Random Walk and Poincaré Embeddings for Healthcare Tasks." In 2019 KDD Workshop on Applied Data Science for Healthcare.
  33. Chen P., S. Choudhury, L.R. Rodriguez, A. Hero, and I. Ray. 2019. "Towards Cyber-Resiliency Metrics for Action Recommendations Against Lateral Movement Attacks." In “Industrial Control Systems Security and Resiliency: Practice and Theory,” Springer, 2019.
  34. Joslyn C.A., M. Robinson, J. Smart, K. Agarwal, D. Bridgeland, A.G. Brown, and S. Choudhury, et al. "HyperThesis: Topological Hypothesis Management in a Hypergraph Knowledgebase." NIST Text Analysis Conference (TAC) 2018, Gaithersburg, Maryland.
  35. Mackey P.S., K. Porterfield, E.B. Fitzhenry, S. Choudhury, and G. Chin. 2018. "A Chronological Edge-Driven Approach to Temporal Subgraph Isomorphism." In the 2nd IEEE BigData Workshop on Graph Techniques for Adversarial Activity Analytics (best paper award).
  36. Deep Learning for Scientific Discovery, C. Corley, N. Hodas, E. Yeung, A. Tartakovsky, T. Hagge, S. Choudhury, K. Agarwal, C. Siegel, J. Daily. The Next Wave, 2018.
  37. Choudhury, S., Purohit, S., Lin, P., Wu, Y., Holder, L., & Agarwal, K. 2018. Percolator: Scalable Pattern Discovery in Dynamic Graphs. 11th ACM International Conference on Web Search and Data Mining.
  38. Sathanur, A. V., Choudhury, S., Joslyn, C., & Purohit, S. (2017). When Labels Fall Short: Property Graph Simulation via Blending of Network Structure and Vertex Attributes. 26th ACM International Conference on Information and Knowledge Management (CIKM)
  39. Purohit, S., Holder, L., & Choudhury, S. (2017, December). Application-Specific Graph Sampling for Frequent Subgraph Mining and Community Detection. In IEEE International Conference on Big Data.
  40. Choudhury S, K Agarwal, S Purohit, B Zhang, M Pirrung, W Smith, and M Thomas. 2017. “NOUS: Construction and Querying of Dynamic Knowledge Graphs.” 8th International Workshop on Data Engineering meets the Semantic Web.
  41. Zhang B, S Choudhury, M Al-Hasan, X Ning, P Pesantez, S Purohit, and K Agarwal. 2016. "Trust from the past: Bayesian Personalized Ranking based Link Prediction in Knowledge Graphs." In 2016 SIAM Data Mining Workshop on Mining Networks and Graphs: A Big Data Analytic Challenge.
  42. Chen PY, Choudhury S and A Hero. 2016. "Multi-centrality Graph Spectral Decompositions and their applications to Cyber Intrusion Detection." In 41st International conference on Acoustics, Speech and Signal Processing.
  43. Rodriguez LR, DS Curtis, PY Chen, S Choudhury, PL Nordquist, KJ Oler, and I Ray. 2015. "DEMO: Action Recommendation for Cyber Resilience." In 22nd ACM Conference on Computer and Communications Security (Demonstration Track).
  44. Choudhury S, PY Chen, LR Rodriguez, DS Curtis, PL Nordquist, I Ray, and KJ Oler. 2015. "Action Recommendation for Cyber Resilience." In SafeConfig 2015: Automated Decision Making for Active Cyber Defense.
  45. Choudhury S, L Holder, G Chin, K Agarwal, JT Feo. 2015. “A Selectivity based approach to Continuous Pattern Detection in Streaming Graphs”, In Proceedings of the 18th International Conference on Extending Database Technology.
  46. Oler KJ, S Choudhury, M Halappanavar, EA Hogan, and CP Dowling. 2015. "A Graph Based Approach to Role Mining from Network Traffic." FloCon 2015, Portland, OR.
  47. Chin G, Jr, S Choudhury, and K Agarwal. 2014. "StreamWorks – A System for Real-Time Graph Pattern Matching on Network Traffic." FloCon 2015, Portland, OR.
  48. Chin G, Jr, S Choudhury, JT Feo, and L Holder. 2014. "Predicting and Detecting Emerging Cyberattack Patterns Using StreamWorks." In Proceedings of the 9th Annual Cyber and Information Security Research Conference, 2014.
  49. Ray A, L Holder, and S Choudhury. 2014. "Frequent Subgraph Discovery in Large Attributed Streaming Graphs." Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications / Special Issue of Journal of Machine Learning Research.
  50. Lieberman M, S Choudhury, M Hughes, D Patrone, S Hider, C Piatko, M Chapman, JP Marple, and D Silberberg. 2014. "Parasol: An Architecture for Cross-Cloud Federated Graph Querying." In 2014 ACM SIGMOD Workshop on Data analytics in the Cloud.
  51. Weaver J., Castellana V. G., Morari A., Tumeo A., Purohit S., Chappell A., Choudhury S., Schuchardt K., Feo J., “Toward a Data Scalable Solution for Facilitating Discovery of Science Resources”, Journal of Parallel Computing, 2014.
  52. Hogan EA, CA Joslyn, S Choudhury, and PSY Hui. "Statistical and Hierarchical Graph Analysis for Cyber Security", SIAM Discrete Mathematics, 2014.
  53. Chin G, Jr, Choudhury S., Holder LB and Feo JT., “Predicting and Detecting Emerging Cyberattack Patterns Using StreamWorks”, 9th Cyber and Information Security Research Conference, 2014.
  54. Morari A.; Castellana V. G., Villa O., Tumeo A., Weaver J., Haglin D., Feo J. and Choudhury S., “Scaling Semantic Graph Databases in Size and Performance”, IEEE Micro, 2014.
  55. Kalyanaraman A., Lu H., Halappanavar M., Choudhury S., “Parallel Heuristics for Scalable Community Detection”, 28th IEEE International Parallel and Distributed Processing Workshops, 2014.
  56. Chappell A., Choudhury S., Feo J., Haglin D., Morari A., Purohit S., Schuchardt K., Tumeo A., Weaver J., and O. Villa. 2013. Toward a data scalable solution for facilitating discovery of scientific data resources. In Proceedings of the 2013 International Workshop on Data-Intensive Scalable Computing Systems (DISCS-2013).
  57. Hogan E., Hui P., Choudhury S., Halappanavar M., Oler K., Joslyn C., “Towards a Multiscale Approach to Cybersecurity Modeling”, IEEE International Conference on Technologies for Homeland Security (HST), 2013.
  58. Joslyn C., Choudhury S., Haglin D., Howe B., Nickless, B., Olsen B., “Massive Scale Cyber Traffic Analysis: A Driver for Graph Database Research”, ACM SIGMOD Workshop on Graph Data Management Experiences and Systems, 2013.
  59. Halappanavar M., Choudhury S., Hui P.Y., Hogan E., Johnson J.R., Ray I., and Holder L., “A Network-of-Networks Framework for Cyber-Security”, IEEE Intelligence and Security Informatics, 2013.
  60. Choudhury S., Holder LB., Chin G, Jr., and Feo JT., “Fast Search for Multi-Relational Graphs”, ACM SIGMOD Workshop on Dynamic Network Management and Mining, 2013.
  61. Choudhury S., Holder LB., Chin G, Jr., Ray A., Beus S. and Feo JT., “StreamWorks – a System for Dynamic Graph Search”, ACM SIGMOD international conference on Management of data, 2013.
  62. Chin G, Jr, Marquez A., Choudhury S., and Feo JT., “Scalable Triadic Analysis of Large-Scale Graphs: Multi-Core vs. Multi-Processor vs. Multi-Threaded Shared Memory Architectures”, 24th Intl. Symposium on Computer Architecture and High Performance Computing, 2012.
  63. Choudhury S., Holder L., Chin G. Jr. and Feo JT., "Large-Scale Continuous Subgraph Queries on Streams", Workshop on High-Performance Computing meets Databases, Co-located with Supercomputing 2011, Seattle, WA, November 2011.
  64. Chin G., Choudhury S., Kangas L., McFarlane S., Marquez A., “Evaluating In-Clique and Topological Parallelism Strategies for Junction-Tree based Bayesian Inference”, Proceedings of 25th IEEE International Parallel and Distributed Processing Workshops, 2011.
  65. Book chapter: “Applications in Data-Intensive Computing”, Anuj R. Shah et al., Advances in Computers, Volume 79 (2010).
  66. Chin G., Choudhury S., Kangas L., McFarlane S., Widener K., “Detecting Fault Conditions in Distributed Sensor Networks Using Dynamic Bayesian Networks”, Proceedings of 1st Science Team Meeting, Atmospheric Systems Research Program, 2010.
  67. Choudhury S., Halter T. and Critchlow T. - "Search Techniques for Atmospheric Data Sets", Proceedings of 19th Atmospheric Radiation Measurement (ARM) Science Team Meeting, 2009.
  68. Chin G., Marquez A., Choudhury S. and Maschoff K. "Implementing and Evaluating Multithreaded Triad Census Algorithms on the Cray XMT", Proceedings of 23rd IEEE International Parallel and Distributed Processing Workshops, 2009.
  69. Choudhury S. "A Web-Based Interface for ARM Data Stream Dependency Application", Proceedings of 18th ARM Science Team Meeting, 2008.
  70. Choudhury S. "ACRF Data Acquisition and Monitoring Software Suite for Instrument PCs", Proceedings of 18th ARM Science Team Meeting, 2008 (Received Chief Scientist’s award for best posters).
  71. An open source implementation of PakBus™ protocol. Choudhury S. and Macduff M., "Linux based software for data collection from Campbell Data Loggers" - Invention Report, Pacific Northwest National Laboratory, 15895-E, 2008.
  72. Choudhury S., Chandrasekar V. “Application of Wideband Reception and Processing in a Dual Polarization radar with Dual Transmitter Systems”, Journal of Atmospheric and Oceanic Technology, 2007
  73. Choudhury S., Chandrasekar V. “Practical Aspects of Wideband Processing and Reception in Dual-Polarization Weather Radars”, AMS 31st International Conference on Radar Meteorology, 2003