Big Data vs. Conventional Data

 



Minelli et al. (2013) note that new, innovative, and more efficient technologies have enabled the management of big data for analytics that extract meaningful insights for decision-making, and they emphasize the many differences between big data and conventional data.

Edd Dumbill, as cited in Minelli et al. (2013), defines big data as “data that becomes large enough that it cannot be processed using conventional methods,” and Gartner’s 2001 definition reads, “Big data is data that contains greater variety arriving in increasing volumes and with ever-higher velocity” (Treehouse, 2022), a definition built on big data’s three V’s: variety, volume, and velocity (Minelli et al., 2013). Kumar et al. (2021) explain that big data makes it possible to store large volumes of data that a conventional data setup cannot hold. Conventional data reflects a centralized database architecture, whereas big data resides in distributed databases. Big data draws on many data sources, while the sources of conventional data are severely limited. Integrating conventional data is an easy process, but integration is more complex with big data because huge amounts are generated quickly and often (Kumar et al., 2021).

Other differences between big data and conventional data include schema flexibility. Big data uses a dynamic schema, whereas conventional data is tied to a fixed schema. Big data also enables real-time analytics, which is not possible with conventional data because it relies mostly on historical records (Treehouse, 2022).
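The contrast between a fixed and a dynamic schema can be illustrated with a minimal sketch. It assumes SQLite as a stand-in for a conventional, fixed-schema store and a list of JSON documents as a stand-in for a dynamic-schema store; neither tool is named by the sources cited here, and the table name, field names, and helper function are hypothetical.

```python
import json
import sqlite3

# Fixed schema (assumption: SQLite stands in for a conventional database).
# The columns are declared up front, before any data is stored.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (id, name) VALUES (?, ?)", (1, "Ana"))


def insert_with_extra_field(connection):
    """Try to store a field the table was never declared with (hypothetical helper)."""
    try:
        connection.execute(
            "INSERT INTO customers (id, name, loyalty_tier) VALUES (?, ?, ?)",
            (2, "Ben", "gold"),
        )
        return True
    except sqlite3.OperationalError:
        # SQLite rejects the row because the column does not exist.
        return False


fixed_schema_accepts_new_field = insert_with_extra_field(conn)

# Dynamic schema (assumption: JSON documents stand in for a big data store).
# Each record carries its own fields, so a new field needs no prior declaration.
documents = [
    {"id": 1, "name": "Ana"},
    {"id": 2, "name": "Ben", "loyalty_tier": "gold"},
]
stored = [json.dumps(doc) for doc in documents]

# Collect every field name that appears across the stored documents.
fields_seen = sorted({key for line in stored for key in json.loads(line)})

print("Fixed schema accepted the new field:", fixed_schema_accepts_new_field)
print("Fields seen in the dynamic store:", fields_seen)
```

In the fixed-schema case the table would have to be altered before the new field could be stored, while the document list simply accepts the second record as it arrives.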

Big data enhances data analytics approaches by allowing analysts to explore the data and develop meaningful questions while the analysis is taking place. With conventional data, reports are structured to answer pre-established questions that offer little insight (Minelli et al., 2013; Treehouse, 2022).


Several public sites offer free access to large datasets. Kaggle, for example, requires users to register because it is more of a community than a search engine. In addition to countless datasets available for download, the site holds machine learning (ML) competitions and offers educational tools for learning data analysis and artificial intelligence (AI) (Hillier, 2022). Like Kaggle, Data.gov offers free access to large datasets, but it does not require users to register. The U.S. federal government compiles data on diverse topics such as crime and climate change, and the site offers a user-friendly interface that lets users drill down by different attributes, including federal, state, county, and city levels (Hillier, 2022).





References 

Hillier, W. (2022, January 12). 10 great places to find free datasets for your next project. CareerFoundry. https://careerfoundry.com/en/blog/data-analytics/where-to-find-free-datasets/

Kumar, K. S. A., Hababa, S. M., Worku, B., Tadele, G., Mengistu, Y. G., & Prasad, A. Y. (2021). Big data characteristics, classification and challenges – A review. Turkish Journal of Computer and Mathematics Education, 12(12), 4236-4243. ProQuest Central.

Minelli, M., Chambers, M., & Dhiraj, A. (2013). Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today’s Businesses (1st ed.). John Wiley & Sons. 

Treehouse Technology Group. (2022). Big data vs. traditional data: What’s the difference? https://treehousetechgroup.com/big-data-vs-traditional-data-whats-the-difference/
