The Starter Guides

The Starter Guides are handcrafted by Data Mine staff to help you work with data. Here you will find brief synopses of the major data-related topics and tools, as well as some less common ones; books, articles, videos, and other resources for learning more about each topic; and free code samples demonstrating technical topics, usually in the form of notebooks and/or containers.

How to Use

The Starter Guides are meant to be use-it-when-you-need-it, so if you know what topic you are looking for, dig right in! Otherwise, this page can help you get started.

If you are brand new to dealing with data, start by learning about the data modeling process. If you need to gather your own dataset, the next step is web scraping or searching for an existing dataset. Once you have data, you might need to select and perform an analysis technique.

For projects with a data engineering focus, check out our Containers, SQL, or SLURM guides.

Data Engineering Vs. Data Science: What’s the Difference?

If you are relatively new to dealing with data, refer to the table below to get a feel for the difference between data engineering and data science.

| Discipline | Data Engineering | Data Science |
|---|---|---|
| Languages Used | Any; General Purpose Languages Most Common, like Python, Java, C++ | Python, R |
| What They Do With Data | "Move Data Around"; Collect, Organize, Set Up Databases; Set Up Cloud Systems | "Make Sense of Data"; Analyze, Train Models, Make Visualizations |
| Common Backgrounds | Computer Science, IT | Math, Statistics, Computer Science |
| Common Tools | Hadoop, NoSQL, Spark, PostgreSQL, Kubernetes, Docker | MapReduce, Keras, PyTorch, JAX, Plotting Packages like ggplot2 |

There are debates about whether data science is data engineering and vice versa, or whether they even belong in the same guide. Because each depends on the other in some form, we included both as separate categories. In some organizations, the same people do both; in others, multiple departments share these responsibilities; still others draw a much starker line between the two than we have here. How the work is divided depends on the organization you work with, and the jury is still out on the right way to categorize these skills. Nonetheless, the two disciplines overlap in many areas, and data professionals of all stripes should know a mix of data engineering and data science to be successful at their jobs.
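
To make the distinction a little more concrete, here is a minimal sketch in Python that uses only the standard library. It is an illustration under stated assumptions, not a recommended workflow: the sensor readings are made-up sample data, and the function names (load_readings, mean_temperature_by_sensor) are hypothetical. The first function does a "move data around" task by organizing records in a database; the second does a "make sense of data" task by summarizing them.

```python
import sqlite3
import statistics

# Hypothetical sample records: (sensor name, temperature reading in Celsius)
RAW_READINGS = [
    ("sensor_a", 20.5),
    ("sensor_a", 21.5),
    ("sensor_b", 18.0),
    ("sensor_b", 19.0),
]


def load_readings(rows):
    """Data engineering step: collect records and organize them in a database."""
    conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
    conn.execute("CREATE TABLE readings (sensor TEXT, temperature REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    conn.commit()
    return conn


def mean_temperature_by_sensor(conn):
    """Data science step: analyze the stored data to summarize it."""
    by_sensor = {}
    query = "SELECT sensor, temperature FROM readings ORDER BY sensor"
    for sensor, temperature in conn.execute(query):
        by_sensor.setdefault(sensor, []).append(temperature)
    # Average the readings collected for each sensor
    return {sensor: statistics.mean(values) for sensor, values in by_sensor.items()}


if __name__ == "__main__":
    connection = load_readings(RAW_READINGS)
    print(mean_temperature_by_sensor(connection))
    connection.close()
```

In a real project the two halves are rarely this tidy, which is part of why the line between the disciplines is hard to draw: the same person often writes both the loading code and the analysis.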

I Don’t See The Topic I Am Looking For

Try the search bar in the top right corner of the Examples Book; it searches across our entire site.