Resources for Building Your Data Science Skillset
This list of resources gives you material to study from and groups to study with. We will update it with any resources you send us. Current resources include:
HPC and Batch Computing
- How to use NIH’s Biowulf Compute Cluster - a self-paced course
Scripting and Programming Languages
- Software Carpentry lessons https://software-carpentry.org/lessons/
Python
- Intro to Python from GitHub https://github.com/python
- Introduction to Programming (with Python) - a webinar from NIAID https://bioinformatics.niaid.nih.gov/resources#70.3.2
R / RStudio
- Introduction to R (RStudio) - https://education.rstudio.com/learn/
Bash / Shell Scripting / UNIX/Linux Command Line
- The UNIX (and Linux) Shell for Novices - http://swcarpentry.github.io/shell-novice/
Data Visualization
- Matplotlib in-depth user guides: beginner, intermediate, and advanced sections, plus specific topics. https://matplotlib.org/tutorials/index.html
Study Groups and Special Interest Groups
- Cloud 4 Bio, led by DCEG’s Jonas de Almeida - Weekly hackathon on Cloud Services and Web Applications for Cancer Research https://cloud4bio.github.io
- NIH Data Science Slack group
General Tutorials and Overviews
- Towards Data Science - tutorials and overviews https://towardsdatascience.com
- Biostars - “Bioinformatics Explained” https://www.biostars.org/
- Dataquest.io - interactive tutorials https://www.dataquest.io/
Note that CCR’s Bioinformatics Training and Education Program (BTEP) has licenses for Biostars and Dataquest.io available for CCR staff. If you are interested in these but are not in CCR, please contact us at NCICBIITDataScienceTraining@mail.nih.gov and we will arrange licenses for you.
NIH Listservs
Note: You must register for an account before subscribing to these.
- BIOINFO-GENERAL-NCI
- BIOINFORMATICS-USER-FORUM
- BIOINFORMATICS-SIG-L
- NCI-LINUX-DESKTOP
- NIH-DATASCIENCE-L
- AI
CBIIT Cancer Data Science Seminar Series https://datascience.cancer.gov/news-events/events/data-science-seminar
Resources for Intermediate or Advanced Learners
Machine Learning
- GitHub tutorial on machine learning https://github.com/topics/machine-learning
- Machine Learning Mastery Tutorials https://machinelearningmastery.com/start-here/