What skills and knowledge should a Data Scientist have?

In general, a Data Scientist should have:

  • Strong background in Mathematics (Calculus, Linear Algebra): able to read mathematical notation and translate it into code
  • Strong background in Probability and Statistics (basic probability, confidence intervals, hypothesis testing, A/B testing, regression, GLMs, etc.)
  • Data Structures & Algorithms: demonstrated in at least one programming language, such as Python or Java
  • Big Data: hands-on experience with HPC/AWS/GCP, with Apache Spark/Hadoop/Kafka for Big Data management, and with Hive/Pig for data processing/ETL
  • Machine Learning & Data Mining: knowledge of classical ML algorithms and hands-on skills with packages such as scikit-learn, NumPy, SciPy, etc., as well as Deep Learning with TensorFlow and Keras, and ML on Big Data with Spark MLlib
  • Databases: relational database models; hands-on experience with typical relational database software (Oracle, MySQL, MS SQL Server) and NoSQL systems (HBase, Cassandra, MongoDB, etc.)
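
As a small illustration of what "translating mathematical notation into code" can look like, here is a minimal sketch that turns three textbook formulas into Python: the sample mean, the unbiased sample variance, and a normal-approximation confidence interval for the mean. The function names and the data values are made up for this example and are not taken from any particular library or dataset.

```python
import math
from statistics import NormalDist


def sample_mean(xs):
    # x_bar = (1/n) * sum_i x_i
    return sum(xs) / len(xs)


def sample_variance(xs):
    # s^2 = (1/(n-1)) * sum_i (x_i - x_bar)^2   (unbiased estimator)
    x_bar = sample_mean(xs)
    return sum((x - x_bar) ** 2 for x in xs) / (len(xs) - 1)


def mean_confidence_interval(xs, level=0.95):
    # x_bar +/- z * s / sqrt(n), using the normal approximation.
    # For small samples a t-quantile would usually be preferred;
    # the normal quantile is used here only to stay within the
    # standard library.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * math.sqrt(sample_variance(xs) / len(xs))
    x_bar = sample_mean(xs)
    return x_bar - half_width, x_bar + half_width


# Hypothetical measurements, chosen only for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
low, high = mean_confidence_interval(data)
print(f"mean={sample_mean(data):.3f}, "
      f"variance={sample_variance(data):.3f}, "
      f"95% CI=({low:.3f}, {high:.3f})")
```

Each function mirrors its formula term by term, which makes it easy to check the code against the notation. In practice the same quantities would normally come from NumPy or SciPy rather than hand-written loops.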