Programming Languages for Data Professionals

In general, anyone working with computers should have a decent, if not proficient, understanding of a shell scripting language; the most common and best known is Bash.

While Python and R dominate much of the data science world, a few other languages also see notable use:

  • Julia (the "Ju" in Jupyter Notebooks stands for Julia; the name as a whole refers to Julia, Python, and R)

  • Java (especially for cybersecurity projects)

  • JavaScript (especially for web-based machine learning)

On the data engineering side, general purpose programming languages predominate. These might include:

  • The C family of languages (C, C++, C#, etc.) for secure, embedded, or otherwise low-level systems

  • JavaScript for front-end design

  • Java

  • Python

  • Scala

And, of course, everyone in the data professions interacts with SQL in one way or another.
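As a rough illustration of how SQL tends to turn up inside other languages, here is a minimal sketch that runs a SQL aggregation from Python using the standard library's sqlite3 module. The table name, column names, sample rows, and the function name average_score_by_team are all made up for this example; a real project would more likely connect to an existing database.

```python
import sqlite3


def average_score_by_team(rows):
    """Load (team, score) rows into an in-memory SQLite table and average them with SQL."""
    conn = sqlite3.connect(":memory:")  # throwaway database, nothing is written to disk
    try:
        # Hypothetical schema, used only for this illustration.
        conn.execute("CREATE TABLE scores (team TEXT, score REAL)")
        # Parameterized inserts keep the data separate from the SQL text.
        conn.executemany("INSERT INTO scores VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT team, AVG(score) FROM scores GROUP BY team ORDER BY team"
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Made-up sample data.
sample_rows = [("a", 1.0), ("a", 3.0), ("b", 5.0)]
print(average_score_by_team(sample_rows))
```

The grouping and averaging happen in SQL, while Python only moves the data in and out. The same split shows up with R, Java, or Scala as the host language.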

At the end of the day, the choice of programming language is largely personal, and often the best language is the one you are most comfortable with. Although design choices baked into each language make it better suited to some tasks, the differences are sometimes so slim that they hardly seem to matter. In the software engineering world, there is an endless debate between Java and C#, which share much of their syntax and often even the names of their calls! In the data science world, Python and R are the subject of a similar debate.