Welcome

Welcome, everyone, and congratulations on choosing to focus on picking up data science skills in your time away from the lab or office. As you know, cancer research is becoming more and more of an information science, and your ability to make sense of large and diverse data sets will be extremely valuable now and into the future.

In this webinar, we will give you a quick overview of the new NCI Data Science Learning Exchange and some tips on how to get the most out of it. We will also leave time for a Q&A so that you can ask questions or suggest how we can improve this program for you.

Very important: The NCI Data Science Learning Exchange is built on a peer-to-peer mutual support model. Some of you are just getting started with data science, others already have some experience, and some may be very advanced. If you have skills in any area of data science, your ideas for helping others get started and make progress are a very important part of this program.

Overview

Intent

  • The goal of the NCI Data Science Learning Exchange is to give you an opportunity to use your social distancing time to learn valuable data science skills. We are doing this by bringing together the large number of data science learners so they can help each other pick up and improve those skills.

Structure

  • The program provides a central online location for sharing good learning resources, along with tools for learners to ask questions and guide each other in learning.
  • This central location allows anyone to chat, add tips to a wiki, or share files to help each other learn.
  • Most important: you will have to use your own computer to follow a study guide and practice, but the Learning Exchange will provide you with peers to help you progress.

Resources

Website: https://cbiit.github.io/p2p-datasci

  • You can always refer to this website for information on how to get started. It contains links to all resources, contact information, and descriptions of the common topics covered in the Learning Exchange.
  • If you bookmark anything, this is it.

Microsoft Teams site

  • Called “NCI Data Science Learning Exchange” (link on the github.io website)
  • This is the central location for interaction among staff learning data science.
  • We have created channels for the most common data science topics people said they are interested in. Think of these as study groups.
  • Chat allows you to ask or answer questions and share tips on learning and practicing; Wikis allow you to add tips or list resources, or read what others have posted; and Files allows you to share files with others.
  • You must be on NIH VPN to log into the Teams website. If this is a problem for you, let us know and we can help.

Tutorials and Guides

  • Links to popular tutorials and guides will be provided in the wikis within each channel (study group).

Dataquest.io and Biostar Handbook Licenses

  • These are both used by NCI’s Bioinformatics Training and Education Program and elsewhere in NIH, so they put you on a good, common ground with other learners. Request one here.

Your Support Team at the NCI Data Science Learning Exchange

  • The team is composed of CBIIT and FNL staff. We are here to make the program work, keep all resources up to date, and support your use of the exchange however we can. Reach out to us with any questions or suggestions.

Most important resource: your fellow NCI staff!

  • You are strongly encouraged to help each other through Microsoft Teams.
  • Blogs - we are asking for volunteers to write short blog articles with tips and guidance for people getting started with data science.
  • Webinars led by you! If you would like to present on any topics, let us know and we can arrange a Webex meeting for you to talk, share your slides, do live demos, and interact with others.

Tips for Success

Start by trying to create a “learning path” for yourself. Initially, your learning path may be as simple as identifying the different skills you want to learn and putting them in sequential order. Others in the Learning Exchange can help you with this.

The most important underlying skill you should be trying to learn is how to be self-sustaining in learning more. The best programmers and data scientists don’t know everything, but they keep a list of resources they can consult to figure out how to do things (e.g., stackoverflow.com).

Do not be intimidated by the massive number of skills, tools, and domains that are considered part of “data science”. Nobody knows everything. Stick with your learning path and keep focusing on progress.

Do not be afraid to ask any question. “I have absolutely no idea where to start. Where do I start?” is a perfectly legitimate (and common) question.

Adopt the mindset that you will always be learning data science. It never ends, even for the most experienced data scientist. Expect that learning will be a standard part of being a data scientist. But don’t worry: learning gets a lot easier and more fun once you become comfortable with the fundamentals.

Q&A