Data Science: About

Data science is a multidisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from data in various forms, both structured and unstructured. In this respect it is similar to data mining.

Disciplines involved: information science, computer science, mathematics, and statistics.

The term "data science" is now often used as a buzzword, interchangeably with earlier fields such as business analytics, business intelligence, predictive modeling, and statistics.

Data Science Stages

  1. Capture Data
    • Data acquisition
    • Data entry
    • Signal reception
    • Data extraction
  2. Maintain Data
    • Data warehousing
    • Data cleansing
    • Data staging
    • Data processing
    • Data architecture
  3. Process
    • Data mining
    • Clustering / classification
    • Data modeling
    • Data summarization
  4. Communicate
    • Data reporting
    • Data visualization
    • Business intelligence
    • Decision making
  5. Analyze
    • Exploratory / confirmatory
    • Predictive analysis
    • Regression
    • Text mining
    • Qualitative analysis
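
The stages above can be sketched end to end on a toy data set. This is only an illustration: the CSV text, the column names, and the function names (capture, maintain, process, analyze, communicate) are hypothetical, and each function stands in for one small activity from its stage (data extraction, data cleansing, data summarization, regression, data reporting). The calls run the analysis before the report because the report needs the fitted trend.

```python
import csv
import io
from statistics import mean

# Hypothetical raw input standing in for captured data.
RAW_CSV = """day,visits
1,10
2,
3,14
4,16
"""

def capture(text):
    """Capture: extract raw records from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))

def maintain(rows):
    """Maintain: cleanse by dropping rows with a missing value and casting types."""
    return [(int(r["day"]), int(r["visits"])) for r in rows if r["visits"].strip()]

def process(pairs):
    """Process: summarize the cleaned data."""
    return {"n": len(pairs), "mean_visits": mean(v for _, v in pairs)}

def analyze(pairs):
    """Analyze: fit a least-squares line visits = a + b * day."""
    mx = mean(x for x, _ in pairs)
    my = mean(y for _, y in pairs)
    b = sum((x - mx) * (y - my) for x, y in pairs) / sum((x - mx) ** 2 for x, _ in pairs)
    return my - b * mx, b

def communicate(summary, line):
    """Communicate: report the results as a readable sentence."""
    _, slope = line
    return (f"{summary['n']} days, mean {summary['mean_visits']:.1f} visits, "
            f"trend {slope:+.2f} per day")

pairs = maintain(capture(RAW_CSV))
print(communicate(process(pairs), analyze(pairs)))
```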

Three main programming skills involved:

  1. R
  2. Python
  3. SQL

Second-tier technology skills involved:

  1. Apache Hadoop
  2. NoSQL
  3. SAS
  4. AI
  5. Machine learning
  6. MATLAB
  7. Cloud computing
  8. Apache Spark
  9. GitHub
  10. Tableau
  11. IPython notebooks
  12. Excel

Data Science Job Titles:

  • Data Scientist
  • Data Analyst
  • Data Engineer

Data skills:

  • Data munging - the data wrangling that brings data together into cohesive views and cleans it up so that it is polished and ready for use
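
A minimal sketch of data munging, using only the Python standard library, may make the definition more concrete. The two record sets, their field names, and the helper names (clean_customers, total_spend, munge) are all hypothetical. The sketch cleans each source (trimming whitespace, normalizing capitalization, dropping duplicates and unparseable values) and then joins them into a single view.

```python
# Two hypothetical sources describing the same customers.
customers = [
    {"id": "1", "name": "  ada   lovelace "},
    {"id": "2", "name": "GRACE HOPPER"},
    {"id": "2", "name": "GRACE HOPPER"},  # verbatim duplicate
]
orders = [
    {"customer_id": "1", "amount": "19.99"},
    {"customer_id": "2", "amount": "5"},
    {"customer_id": "2", "amount": "n/a"},  # unusable value
]

def clean_customers(rows):
    """Trim whitespace, normalize capitalization, and keep the first row per id."""
    names = {}
    for row in rows:
        tidy = " ".join(row["name"].split()).title()
        names.setdefault(int(row["id"]), tidy)
    return names

def total_spend(rows):
    """Sum the parseable order amounts per customer id, skipping bad values."""
    totals = {}
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip amounts that cannot be parsed as numbers
        cid = int(row["customer_id"])
        totals[cid] = totals.get(cid, 0.0) + amount
    return totals

def munge(customer_rows, order_rows):
    """Join both cleaned sources into one cohesive view keyed by customer id."""
    names = clean_customers(customer_rows)
    totals = total_spend(order_rows)
    return [
        {"id": cid, "name": name, "total": totals.get(cid, 0.0)}
        for cid, name in sorted(names.items())
    ]

view = munge(customers, orders)
for record in view:
    print(record)
```

In practice this kind of work is often done with a dataframe library or SQL; plain dictionaries are used here only to keep the sketch self-contained.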