Data, AI, and digital systems are increasingly important in today's applications. However, 'learning the theory' and 'making it work' are not the same thing. What matters is being able to implement useful systems in real life.
Data-X bridges the gap between theory and practice by combining state-of-the-art tools with innovation processes. Data-X is powered by the Innovation Engineering framework.
Data-X is a lab for students, researchers, companies, and leading global institutions.
Data-X is for anyone interested in careers, new ventures, and innovative projects in areas related to data science and information technology systems.
Taking a purely theoretical course is not enough. Students often take course after course in technical subject areas without learning to implement or apply what they have studied, or to make an innovative impact. Data-X places a real-life, innovative, emerging-technology project at the center of a learning experience that includes powerful tools, theory, and innovation behaviors and mindset. Data-X also builds on Innovation Engineering, a powerful framework for guiding innovation projects.
Students bring their own ideas into an integrative project that can help them pursue new opportunities, ranging from starting a new venture to interviewing for industry positions.
To work with and contribute to Data-X, check out our Collaborate page.
Data-X was created in 2015 by Sutardja Center faculty director and chief scientist, Ikhlaq Sidhu. Ikhlaq wanted to create a data science course that equipped students with the most relevant skills and approaches for getting started on innovative data science projects. His idea was to teach students the most relevant industry programming skills and the most commonly used algorithms, and to pair these with an experiential data science project so that students would learn how to execute data science projects in real life.
Since 2015, more than 1,500 students have taken Applied Data Science with Venture Applications, or Data-X for short.
"The opportunity to dive into extensive projects with diverse teams, getting involved with industry mentors, the openness and flexibility of the Profs and GSIs makes the course a must have for everyone interested in data analytics. My two-semester long involvement with the class and the Profs was a significant contributing factor to me being a Data Scientist today."
"I think this class is so awesome because it teaches the tools and concepts that are most commonly used in workplace teams that are involved with data science and applied machine learning."
"135 has to be my favorite of the ML classes at Berkeley. It covers A TON of content. The course is very application-focused and yet explains the general idea behind concepts."
"DataX is a very rare data science course that prepares students ready to be real world data scientists. AnChain.ai has hired great talents from the DataX course, and we are excited to see they are applying the DataX philosophy to challenging machine learning problems, not just from how to code up deep learning SGD solver, but also from the business and product perspective."
-- Victor Fang, Ph.D., CEO of AnChain.ai
# | Topic | Lecture Materials | Code Video | Homework | Resources |
00: Getting Started
010 | Introduction Basics | Video Slides Code | Code Video | References Page | |
020 | Project Guidance | Video Slides Code | Code Video | Project Guidelines | |
030 | Install Instructions | Video Slides Code | Code Video | References Page | |
01: Fundamentals
100A | Prediction and Linear Regression Part I | Video Slides Code | Code Video | HW | NA |
100B | Prediction and Linear Regression Part II | Video Slides Code | Code Video | HW | NA |
110 | NumPy | Video Slides Code | Code Video | HW-NumPy | Introduction to NumPy 101 NumPy Exercises NumPy Cheatsheet |
120 | Pandas | Video Slides Code | Code Video | HW-Pandas | Introduction to Pandas 10 Minutes to Pandas Pandas Cheatsheet |
130 | X Data Visualization | Video Slides Code | Code Video | HW | NA |
140 | X Logistic Regression and SKlearn (Empty) | Video Slides Code | Code Video | HW | List of Resources |
160 | X Predictive Model (Titanic): Putting it together | Video Slides Code | Code Video | HW | NA |
170 | X ML Algorithm Overview | Video Slides Code | Code Video | HW | NA |
180 | X Cross-Validation and Regularization | Video Slides Code | Code Video | HW | NA |
02: Data Signals
200 | X Correlation | Video Slides Code | Code Video | HW | NA |
215A | Time Series | Video Slides | Code Video | HW | NA |
215B | Time Series | Video Code | Code Video | HW-TS | NA |
220 | X Decision Trees, Information Theory | Video Slides Code | Code Video | HW | NA |
250 | X Spectral Signals | Video Slides Code | Code Video | HW | NA |
03: Data Handling
310 | Web Scraping | Video Slides Code | Code Video | HW | NA |
320 | X Flask | Video Slides Code | Code Video | HW | NA |
04: Deep Learning
410 | Intro to TensorFlow | Video Slides Code | Code Video | HW | NA |
420 | X Neural Networks | Video Slides Code | Code Video | HW | NA |
430 | X Convolutional Neural Networks | Video Slides Code | Code Video | HW | NA |
05: Natural Language Processing
500 | Text Processing | Video Slides CodeX | Code Video | HW | NA |
510 | X Feature Engineering & Text Representation | Video Slides Code | Code Video | HW | NA |
520 | X Learning Models | Video Slides Code | Code Video | HW | NA |
06: Data-X Library
610 | Stock Market Data and Quotes | Video Slides Code | Code Video | HW | NA |