From data wrangling to analysis and visualization, Penn offers a wide variety of resources and research opportunities to help you get started. Some of the available resources are listed below.

Labs and Centers Using Data Science:

Groups for the Penn Community:

  • Penn Data Science Group (PDSG) is a social and professional community for students to exchange ideas, connect with industry professionals, and develop the skills needed for a successful data science career.
  • Penn Libraries Mapping and GIS club (PennMGIS) meets on the 3rd Thursday of every month and features workshops, talks, and discussions during the first hour, followed by co-working and collaborative projects in the second hour. All skill levels are welcome.
  • Python Users Group (PUG@Penn) meets on the 1st Thursday of every month. It is a drop-in, informal, collaborative group that can help you get started on a new project, troubleshoot a current one, or discuss and learn about useful Python tools, applications, and libraries. All skill levels are welcome.
  • R Penn Group meets on the 2nd Thursday of every month to discuss, learn, and collaborate on topics and projects related to statistics, mapping, data wrangling, visualization, and analysis with the R programming language through hands-on problem-solving. The two-hour meeting is reserved for co-working and collaborative study with your peers. Bring your own laptop. Students, staff, and faculty of the University of Pennsylvania at any skill level are welcome.

Penn Libraries Training Resources:

  • Data services for extracting, scraping, and managing data; working with GIS and mapping data; and creating and working with visualizations. They also help find datasets that others have collected.
  • R/RStudio guide and resources
  • Python guide and resources
  • Dataquest: An online, self-paced learning platform for picking up skills in data visualization, data scraping, R, Python, and more. Penn has a premium license for this. To activate your account, you will need to complete this form.
  • Data Planning & Management provides a list of best practices in data management, from file organization to security and data archiving. Individual appointments for assistance are available.
  • A guide to working with GIS data, ArcGIS, mapping, and related statistical analyses.
  • How to find and use population data from the U.S. Census Bureau

Data Events and Workshops

  • Earth Week workshops on working with data relating to environmental issues
  • 6-week workshop walking through the data life cycle, from sourcing to storytelling
  • Recordings of data workshops and tutorials
  • Weekly data workshops are hosted by Penn Libraries’ Research Data & Digital Scholarship team
  • Faculty and staff can request custom workshops in which an instructor teaches a session on any of the following topics:

data ethics and data privacy, data management and data management plans (including funder mandates), data visualization, digital exhibit and publication platforms, fundamentals of research computing (R, Python, etc.), mapping and GIS, natural language processing and machine learning, open science, FAIR principles, ORCID iDs, qualitative data analysis (ATLAS.ti; NVivo), research communication, research sharing, text mining and computational text analysis, web scraping


  • Girmaye Misgna: Provides research consultation and training services to faculty, staff, and students who are engaging with spatial methods, bringing deep technical expertise in both GIS and mapping application development to that work
  • Manuel de la Cruz Gutierrez: Director of Data & Innovation Services at the Biotech Commons
  • Nicky Agate: Snyder-Granader Assistant University Librarian for Research Data & Digital Scholarship, supporting faculty and students whose research makes intensive use of digital methods and tools, including research data across the disciplines
  • Jajwalya Karajgikar: Applied Data Science Librarian who engages with researchers across the disciplines who are interested in employing techniques for data storytelling, natural language processing, computational social sciences, data visualization, network analysis, and text mining. She works with campus partners to establish foundational programming in research computing, data literacy, and data ethics.

Have a specific data-related question and need immediate assistance? Send an email to