This space contains links to research papers, event sources and useful resources on data science and big data. Machine learning, an important part of data science, is also covered here. Before we go much further, it's worth clarifying some terminology.
The term big data is somewhat ambiguous; here we assume that it refers to a lot of data (typically terabytes, but not necessarily), generated faster than it can be held in conventional memory, which may be a mix of structured and unstructured data (although most of the world's data today is unstructured). For this reason big data is normally characterised by the three 'Vs': volume, velocity, and variety, although these terms are not precise.
Data science is an evolving field, comprising a number of principles, processes, and algorithms for extracting (typically non-obvious) patterns from large datasets. As a general rule, if these patterns can be easily differentiated by humans, then we don't need data science for that particular use case. Data science is a broad discipline incorporating several fields, including statistics, data mining, and machine learning. Its objective is to improve decision making based on actionable insights drawn from the analysis of large datasets.
Analysis of a large dataset may reveal many interesting patterns. By actionable we mean that these insights should be capable of being acted upon (for example, to further optimise a process by making the necessary changes). Therefore we must extend the earlier definition of big data to include the requirement that the content of a data feed should have some potential value once analysed. You could argue that at that point data becomes information; otherwise the data is essentially noise, and of little use in data science.
Related Posts
Five Myths About Data Science | June 2019
Resources
Data science is a broad topic, which overlaps with several areas such as conventional statistics, machine learning, and big data analytics. Below are a number of links to places you might want to start exploring.
'Data Science' | Wikipedia
'Machine Learning' | Wikipedia
'Supervised Learning' | Wikipedia
'Unsupervised Learning' | Wikipedia
'Clustering' | Wikipedia
'Structured Prediction' | Wikipedia
'Dimensionality Reduction' | Wikipedia
'Anomaly Detection' | Wikipedia
'Reinforcement Learning' | Wikipedia
'Artificial Neural Network' | Wikipedia